The "Last Line Effect" in Programming

My name is Andrey Karpov, and I have studied hundreds of errors caused by "copy-paste." I am convinced that programmers often make mistakes in the last segment of a block of similar code. As far as I know, no programming book discusses this phenomenon, so I decided to write about it myself. I call it the "Last Line Effect."

The Last Line Effect

When programming, programmers often need to write a series of similar structures. Typing each line manually is tedious and inefficient. This is why they use the "copy-paste" method: a piece of code is copied and pasted several times, then modified. Everyone knows the downside of this: it's easy to forget to modify something after pasting, leading to issues. Unfortunately, there often isn't a better method.

So, what pattern did I discover? I found that errors often occur in the last pasted block of code.

Here is a brief example:

inline Vector3int32& operator+=(const Vector3int32& other) {
  x += other.x;
  y += other.y;
  z += other.y;
  return *this;
}

Notice this line: "z += other.y;". The programmer forgot to replace 'y' with 'z'.
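
For comparison, a corrected version might look like the sketch below. The struct is a simplified stand-in written for this illustration: the name `Vector3int32Sketch` and the `std::int32_t` fields are assumptions, not the application's real class. Only the third line of the operator body differs from the excerpt.

```cpp
#include <cstdint>

// Simplified stand-in for the vector type in the excerpt (hypothetical layout).
struct Vector3int32Sketch {
  std::int32_t x = 0;
  std::int32_t y = 0;
  std::int32_t z = 0;

  // Adds each component of `other` to the matching component of this vector.
  Vector3int32Sketch& operator+=(const Vector3int32Sketch& other) {
    x += other.x;
    y += other.y;
    z += other.z;  // the excerpt has "z += other.y;" here
    return *this;
  }
};
```

With this version, adding (10, 20, 30) to (1, 2, 3) gives (11, 22, 33); the pasted version would leave z at 23.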

You might think this is a hypothetical example, but it actually comes from a real application. Next, I will show you that this is a common error. Programmers frequently make this mistake at the end of a series of similar operations.

I've heard that climbers often fall in the last few meters. It's not because they are tired. They are so excited about reaching the end that they become careless and slip. I suspect programmers are similar.

Now, let's look at some data.

After studying our bug database, I identified 84 code fragments produced by "copy-paste." In 41 of them, the error was somewhere in the middle of the pasted blocks. For example:

strncmp(argv[argidx], "CAT=", 4) &&
strncmp(argv[argidx], "DECOY=", 6) &&
strncmp(argv[argidx], "THREADS=", 6) &&
strncmp(argv[argidx], "MINPROB=", 8)) {

The "THREADS=" string is 8 characters long, not 6.
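
To see why the length matters: with a length of 6, only "THREAD" is compared, so any argument that starts with those six characters is treated as a match. One possible way to avoid typing the length by hand is to derive it from the string literal itself. The helper below is a hypothetical sketch (the name `HasPrefix` is not from the project in question), and it only works when the prefix is passed as a literal or a char array.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: N is the size of the literal including the
// terminating '\0', so the number of characters to compare is N - 1.
template <std::size_t N>
bool HasPrefix(const char* text, const char (&prefix)[N]) {
  return std::strncmp(text, prefix, N - 1) == 0;
}
```

A call such as `HasPrefix(argv[argidx], "THREADS=")` then compares all 8 characters, and there is no second number to forget to update after pasting.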

In the other 43 segments, the error occurred in the last pasted block.

Of course, 43 is not much more than 41. But a fragment may contain many similar blocks, so an error can sit in the first, second, fifth, or even tenth block. Errors are spread fairly evenly across those other blocks, while the last block shows a peak.

On average, a fragment contains 5 similar code blocks.

So the 41 errors are spread evenly over the first 4 blocks, which gives about 10 errors per block.

The last block alone, however, accounts for 43 errors!

The following distribution chart highlights this phenomenon:

[Figure: Distribution chart of errors in five similar code blocks]

Thus, we can conclude a rule:

The probability of an error in the last pasted block is about 4 times higher than in any other block.
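
The arithmetic behind this figure can be written out as a short sketch. The three constants come from the text above; the function names are made up for this illustration, and the even spread over the first four blocks is the same simplifying assumption the text makes.

```cpp
// Figures quoted in the text.
constexpr int kErrorsInEarlierBlocks = 41;
constexpr int kErrorsInLastBlock = 43;
constexpr int kAverageBlocks = 5;

// Errors per block among the first (kAverageBlocks - 1) blocks: 41 / 4 = 10.25.
constexpr double ErrorsPerEarlierBlock() {
  return static_cast<double>(kErrorsInEarlierBlocks) / (kAverageBlocks - 1);
}

// Last block compared with an average earlier block: 43 / 10.25, about 4.2.
constexpr double LastBlockRatio() {
  return kErrorsInLastBlock / ErrorsPerEarlierBlock();
}
```

The ratio comes out slightly above 4, which is where the rounded "4 times" comes from.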

This rule may not be universally applicable. It's an interesting finding, and its practical value is to remind you to stay vigilant when writing the last block.

Examples:

Next, I will demonstrate that this is not just my imagination but a real trend. Here are some examples.

Of course, I won't list all examples, only simple and representative ones.

Source Engine SDK

inline void Init( float ix=0, float iy=0,
                  float iz=0, float iw = 0 ) 
{
  SetX( ix );
  SetY( iy );
  SetZ( iz );
  SetZ( iw );
}

The last line should be SetW().

Chromium

if (access & FILE_WRITE_ATTRIBUTES)
  output.append(ASCIIToUTF16("\tFILE_WRITE_ATTRIBUTES\n"));
if (access & FILE_WRITE_DATA)
  output.append(ASCIIToUTF16("\tFILE_WRITE_DATA\n"));
if (access & FILE_WRITE_EA)
  output.append(ASCIIToUTF16("\tFILE_WRITE_EA\n"));
if (access & FILE_WRITE_EA)
  output.append(ASCIIToUTF16("\tFILE_WRITE_EA\n"));
break;

The last two lines are identical.

ReactOS

if (*ScanString == L'\"' ||
    *ScanString == L'^' ||
    *ScanString == L'\"')

The last comparison repeats the first one: the quote character is checked twice.

Multi Theft Auto

class CWaterPolySAInterface
{
public:
    WORD m_wVertexIDs[3];
};
CWaterPoly* CWaterManagerSA::CreateQuad (....)
{
  ....
  pInterface->m_wVertexIDs [ 0 ] = pV1->GetID ();
  pInterface->m_wVertexIDs [ 1 ] = pV2->GetID ();
  pInterface->m_wVertexIDs [ 2 ] = pV3->GetID ();
  pInterface->m_wVertexIDs [ 3 ] = pV4->GetID ();
  ....
}

The last line writes past the end of the array: m_wVertexIDs has only 3 elements, so index 3 is out of bounds. The line was most likely pasted out of habit.

Source Engine SDK

intens.x=OrSIMD(AndSIMD(BackgroundColor.x,no_hit_mask),
                AndNotSIMD(no_hit_mask,intens.x));
intens.y=OrSIMD(AndSIMD(BackgroundColor.y,no_hit_mask),
                AndNotSIMD(no_hit_mask,intens.y));
intens.z=OrSIMD(AndSIMD(BackgroundColor.y,no_hit_mask),
                AndNotSIMD(no_hit_mask,intens.z));

The programmer forgot to change "BackgroundColor.y" to "BackgroundColor.z" in the last line.

Trans-Proteomic Pipeline

void setPepMaxProb(....)
{  
  ....
  double max4 = 0.0;
  double max5 = 0.0;
  double max6 = 0.0;
  double max7 = 0.0;
  ....
  if ( pep3 ) { ... if ( use_joint_probs && prob > max3 ) ... }
  ....
  if ( pep4 ) { ... if ( use_joint_probs && prob > max4 ) ... }
  ....
  if ( pep5 ) { ... if ( use_joint_probs && prob > max5 ) ... }
  ....
  if ( pep6 ) { ... if ( use_joint_probs && prob > max6 ) ... }
  ....
  if ( pep7 ) { ... if ( use_joint_probs && prob > max6 ) ... }
  ....
}

The programmer forgot to change "prob > max6" to "prob > max7" in the last condition.

SeqAn

inline typename Value<Pipe>::Type const & operator*() {
  tmp.i1 = *in.in1;
  tmp.i2 = *in.in2;
  tmp.i3 = *in.in2;
  return tmp;
}

The last assignment reads in.in2 a second time; judging by the pattern, it was presumably meant to read in.in3.

SlimDX

for( int i = 0; i < 2; i++ )
{
  sliders[i] = joystate.rglSlider[i];
  asliders[i] = joystate.rglASlider[i];
  vsliders[i] = joystate.rglVSlider[i];
  fsliders[i] = joystate.rglVSlider[i];
}

The last line should use rglFSlider.

Qt

if (repetition == QStringLiteral("repeat") ||
    repetition.isEmpty()) {
  pattern->patternRepeatX = true;
  pattern->patternRepeatY = true;
} else if (repetition == QStringLiteral("repeat-x")) {
  pattern->patternRepeatX = true;
} else if (repetition == QStringLiteral("repeat-y")) {
  pattern->patternRepeatY = true;
} else if (repetition == QStringLiteral("no-repeat")) {
  pattern->patternRepeatY = false;
  pattern->patternRepeatY = false;
} else {
  //TODO: exception: SYNTAX_ERR
}

The last block is missing 'patternRepeatX'. The correct code should be:

pattern->patternRepeatX = false;
pattern->patternRepeatY = false;

ReactOS

const int istride = sizeof(tmp[0]) / sizeof(tmp[0][0][0]);
const int jstride = sizeof(tmp[0][0]) / sizeof(tmp[0][0][0]);
const int mistride = sizeof(mag[0]) / sizeof(mag[0][0]);
const int mjstride = sizeof(mag[0][0]) / sizeof(mag[0][0]);

'mjstride' is always equal to 1. The last line should be:

const int mjstride = sizeof(mag[0][0]) / sizeof(mag[0][0][0]);
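
The difference between the two expressions can be checked at compile time. In the sketch below, `mag_sketch` is a hypothetical array with made-up dimensions, since the excerpt does not show how `mag` is declared.

```cpp
#include <cstddef>

// Hypothetical array; the real dimensions of `mag` are not shown in the excerpt.
float mag_sketch[4][5][6];

// Dividing a size by itself always yields 1, whatever the array shape is.
constexpr std::size_t kBuggyStride =
    sizeof(mag_sketch[0][0]) / sizeof(mag_sketch[0][0]);

// Dividing the size of a row by the size of one element yields the row length.
constexpr std::size_t kFixedStride =
    sizeof(mag_sketch[0][0]) / sizeof(mag_sketch[0][0][0]);

static_assert(kBuggyStride == 1, "self-division is always 1");
static_assert(kFixedStride == 6, "row length of the hypothetical array");
```

Because `sizeof` does not evaluate its operand, both values are compile-time constants and the array is never read.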

Mozilla Firefox

if (protocol.EqualsIgnoreCase("http") ||
    protocol.EqualsIgnoreCase("https") ||
    protocol.EqualsIgnoreCase("news") ||
    protocol.EqualsIgnoreCase("ftp") ||          <<<---
    protocol.EqualsIgnoreCase("file") ||
    protocol.EqualsIgnoreCase("javascript") ||
    protocol.EqualsIgnoreCase("ftp")) {          <<<---

The final "ftp" is suspicious; it has already been compared earlier.

Quake-III-Arena

if (fabs(dir[0]) > test->radius ||
    fabs(dir[1]) > test->radius ||
    fabs(dir[1]) > test->radius)

The programmer forgot to check dir[2]; dir[1] is checked twice instead.

Clang

return (ContainerBegLine <= ContaineeBegLine &&
        ContainerEndLine <= ContaineeEndLine &&
        (ContainerBegLine != ContaineeBegLine ||
         SM.getExpansionColumnNumber(ContainerRBeg) <=
         SM.getExpansionColumnNumber(ContaineeRBeg)) &&
        (ContainerEndLine != ContaineeEndLine ||
         SM.getExpansionColumnNumber(ContainerREnd) >=
         SM.getExpansionColumnNumber(ContainerREnd)));

In the last block, the expression "SM.getExpansionColumnNumber(ContainerREnd)" is being compared to itself.

MongoDB

bool operator==(const MemberCfg& r) const {
  ....
  return _id==r._id && votes == r.votes &&
         h == r.h && priority == r.priority &&
         arbiterOnly == r.arbiterOnly &&
         slaveDelay == r.slaveDelay &&
         hidden == r.hidden &&
         buildIndexes == buildIndexes;
}

The programmer forgot the "r." prefix in the last line, so buildIndexes is compared with itself.

Unreal Engine 4

static bool PositionIsInside(....)
{
  return
    Position.X >= Control.Center.X - BoxSize.X * 0.5f &&
    Position.X <= Control.Center.X + BoxSize.X * 0.5f &&
    Position.Y >= Control.Center.Y - BoxSize.Y * 0.5f &&
    Position.Y >= Control.Center.Y - BoxSize.Y * 0.5f;
}

In the last line, the programmer made two mistakes. First, ">=" should be changed to "<=", and second, the minus sign should be changed to a plus sign.
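
With both fixes applied, the check might look like the sketch below. The types and names here (`Vec2Sketch`, `PositionIsInsideSketch`) are hypothetical stand-ins, because the engine's real signature is elided in the excerpt.

```cpp
// Hypothetical minimal 2-D vector; the engine's real types are not shown.
struct Vec2Sketch {
  float X;
  float Y;
};

// Returns true when `position` lies inside the axis-aligned box centred at
// `center` with full extents `box_size` (edges included).
bool PositionIsInsideSketch(const Vec2Sketch& position,
                            const Vec2Sketch& center,
                            const Vec2Sketch& box_size) {
  return position.X >= center.X - box_size.X * 0.5f &&
         position.X <= center.X + box_size.X * 0.5f &&
         position.Y >= center.Y - box_size.Y * 0.5f &&
         position.Y <= center.Y + box_size.Y * 0.5f;  // upper bound on Y
}
```

In the pasted version the lower bound on Y is tested twice, so a point far above the box would still be reported as inside.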

Qt

qreal x = ctx->callData->args[0].toNumber();
qreal y = ctx->callData->args[1].toNumber();
qreal w = ctx->callData->args[2].toNumber();
qreal h = ctx->callData->args[3].toNumber();
if (!qIsFinite(x) || !qIsFinite(y) ||
    !qIsFinite(w) || !qIsFinite(w))

In the last qIsFinite, the parameter passed should be 'h'.

OpenSSL

if (!strncmp(vstart, "ASCII", 5))
  arg->format = ASN1_GEN_FORMAT_ASCII;
else if (!strncmp(vstart, "UTF8", 4))
  arg->format = ASN1_GEN_FORMAT_UTF8;
else if (!strncmp(vstart, "HEX", 3))
  arg->format = ASN1_GEN_FORMAT_HEX;
else if (!strncmp(vstart, "BITLIST", 3))
  arg->format = ASN1_GEN_FORMAT_BITLIST;

The string "BITLIST" has a length of 7, not 3.

That's enough. The examples I've given should be sufficient to illustrate the issue, right?

Conclusion

In short, when code is written by "copy-paste," an error is about four times more likely in the last pasted block than in any other block.

This is related to human psychology, not technical proficiency. The article demonstrates that even top programmers in projects like Clang or Qt can make such mistakes.

I hope this discovery is helpful to programmers and might encourage them to study our bug database. I believe this could help in finding new patterns in these errors and formulating new programming advice.

This article is reprinted from: http://www.vaikan.com/the-last-line-effect/
