Kip

What’s wrong with special characters?

Written by Kip on Monday, August 25, 2008 at 2:20 pm (EDT)

Here is a message I got after logging into a website recently:

** NOTE ** Using a colon (“:”) in your password can create problems when logging in to Banner Self Service. If your password includes a colon, please change it using the PWManager link below.

Protip: If you are designing any kind of login/authentication system and you find that you need to give users a warning similar to this, you are doing something wrong.

On a much more nitpicky side note, why not just make “PWManager” or “using the PWManager” link to PWManager?  To their credit, at least they didn’t say “by clicking the PWManager link below.”


Code excerpt

Written by Kip on Monday, May 19, 2008 at 3:05 pm (EDT)

Do programmers still need to understand pointers in this day and age?  Is Java a good programming language?  Why is there something as opposed to nothing?

Please do not indulge in heated debate pertaining to any of the above unanswerable philosophical questions.  Instead, just let me show you what can happen when someone has to program in Java without understanding two things about the Java language: 1) all parameters are always passed by value; 2) object variables are basically pointers (and the address a variable holds can only be changed by assigning to that variable).  Without this knowledge, you might write code like this [1]:

            //ABC-defect#123456 - SetChildren update currSel so we need to take bakup of
                //currSel so that after returning we can set it back to original currSel
           TreeNode tmpCurrSel = currSel;
       int result = setChildren(currSel,fullTree,0,tmpLevels);
          if (tmpCurrSel != null)
                currSel=tmpCurrSel; //ABC-defect#123456

Hint: it does not work as the developer expected it to (and obviously he never tested it after he coded it).
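
For anyone who wants to see the two rules in action, here is a minimal Java sketch.  The TreeNode class and the reassign()/mutate() helpers are made up for illustration; they are not the code behind the defect.  It shows that assigning to a parameter never changes the caller's variable, and that a "backup" copy of a reference still points at the very same object.

```java
class PassByValueDemo {
    // Hypothetical stand-in for the anonymized TreeNode class.
    static class TreeNode {
        String label;
        TreeNode(String label) { this.label = label; }
    }

    // Reassigning the parameter only changes this method's local copy of the reference.
    static void reassign(TreeNode node) {
        node = new TreeNode("replaced");
    }

    // Mutating through the parameter changes the one object the caller also sees.
    static void mutate(TreeNode node) {
        node.label = "mutated";
    }

    public static void main(String[] args) {
        TreeNode currSel = new TreeNode("original");
        TreeNode tmpCurrSel = currSel;   // copies the reference, not the object

        reassign(currSel);
        System.out.println(currSel.label);           // original
        mutate(currSel);
        System.out.println(currSel.label);           // mutated
        System.out.println(tmpCurrSel == currSel);   // true: the "backup" is the same object
    }
}
```

So a method cannot repoint the caller's currSel, and saving the reference beforehand does nothing to undo changes made to the object itself.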

[1] Code has been anonymized, but indentation and grammar are preserved for EnhancedRealism™.

Macrolicious

Written by Kip on Thursday, March 6, 2008 at 4:03 pm (EST)

I recently came across a clever way of writing preprocessor macros, and I figured that I would share.

Let’s say that for some reason you need to write a macro: MACRO(X,Y) [1].  You want this macro to emulate a function call in every way [2].

Example 1: This should work as expected.
if (x > y)
  MACRO(x, y);
do_something();

Example 2: This should not result in a compiler error.
if (x > y)
  MACRO(x, y);
else
  MACRO(y - x, x - y);

Example 3: This should not compile.
do_something();
MACRO(x, y)
do_something();

The naïve way to write the macro is like this:

#define MACRO(X,Y)                       \
cout << "1st arg is:" << (X) << endl;    \
cout << "2nd arg is:" << (Y) << endl;    \
cout << "Sum is:" << ((X)+(Y)) << endl;

This is a very bad solution which fails all three examples, and I shouldn’t need to explain why.

Now, the way I most often see macros written is to enclose them in curly braces, like this:

#define MACRO(X,Y)                         \
{                                          \
  cout << "1st arg is:" << (X) << endl;    \
  cout << "2nd arg is:" << (Y) << endl;    \
  cout << "Sum is:" << ((X)+(Y)) << endl;  \
}

This solves example 1, because the macro is in one statement block.  But example 2 is broken because we put a semicolon after the call to the macro.  This makes the compiler think the semicolon is a statement by itself, which means the else statement doesn’t correspond to any if statement!  And lastly, example 3 compiles OK, even though there is no semicolon, because a code block doesn’t need a semicolon.

The solution is kind of clever, I thought:

#define MACRO(X,Y)                         \
do {                                       \
  cout << "1st arg is:" << (X) << endl;    \
  cout << "2nd arg is:" << (Y) << endl;    \
  cout << "Sum is:" << ((X)+(Y)) << endl;  \
} while (0)

Now you have a single block-level statement, which must be followed by a semicolon.  This behaves as expected and desired in all three examples.  I have noticed this macro pattern before, but I never really thought about why it was written this way.  Mainly because I don’t often write macros to begin with.

[1] You should first ask yourself why you can’t just write a regular function and declare it inline, so that the compiler will do the work for you.  I’m going to assume there is some good reason why you must use a macro.
[2] Every way, that is, except that it can’t return a value.  That gets much trickier and involves heavy abuse of the ?: operator, if it is even possible at all.

Doh!

Written by Kip on Thursday, February 14, 2008 at 1:23 pm (EST)

I knew it would happen eventually.  I put in some code that broke our software, and it wasn’t discovered until nearly a month later, on the day the final build was scheduled.  This meant that the final build had to be delayed for a few days, which is kind of a big deal because it can affect ship date.  So lots of e-mails were circulated which featured my name—often in a red, boldface font—in various lists of actions.

Posted below is a paraphrased version of the code in question.  I’ve renamed or taken out anything that might refer to our internal codebase, and I’ve simplified a little, but not to the point that I look like a complete idiot for missing this.  The QueryInterface() and Release() stuff might look a little weird if you’re not familiar with COM+.  Or all of this will look weird if you’re not a programmer.  But odds are you’re about to stop reading if you aren’t a programmer.

LIST(IUnknown) listObjs;
Session->GetModifiedObjects(listObjs);

const int nbObjs = listObjs.Size();

if (nbObjs > 0)
{
   ObjectID** listObjIds = new ObjectID*[nbObjs];
   for (int i = 1; i <= nbObjs; i++)
   {
      ObjectID* pObjId = NULL;
     
      IUnknown* pUnknown = listObjs[i];
      IPart* pPart = NULL;
      if (pUnknown != NULL)
      {
        RC = pUnknown->QueryInterface(IID_IPart, (void **) &pPart);
        pUnknown->Release(); pUnknown = NULL;
      }
      if (SUCCEEDED(RC) && pPart != NULL)
      {
        RC = pPart->get_ObjectID(pObjId);
        pPart->Release(); pPart = NULL;
      }
     
      listObjIds[i] = pObjId;
   }
   ...
   //process listObjIds
   ...
   for (int i = 1; i <= nbObjs; i++)
   {
      if (listObjIds[i] != NULL) { delete listObjIds[i];  listObjIds[i] = NULL; }
   }
   delete [] listObjIds;  listObjIds = NULL;
}

For those of you still with me, maybe you already see the problem.  The LIST() macro in our infrastructure behaves pretty similarly to a Vector in Java: it will resize itself dynamically, check array bounds, and automatically free memory when it is destroyed.  However, because this was written long ago by guys with a Fortran background, the items in the list start at 1, whereas a C++ array starts at 0.  Also, it only works with components implementing IUnknown; plain-old-C++ objects must be handled with plain-old-C++ arrays.  In the code above, this meant I could not declare listObjIds as an object of type LIST(ObjectID*).  So I had a LIST(IUnknown) and an array of ObjectID*s, but I treated both as LISTs!  In fact, I have gotten so used to using LISTs in C++ that I completely forgot that listObjIds was an array (I guess a better variable name would have helped too).

The line listObjIds[i] = pObjId; should instead be written listObjIds[i-1] = pObjId;, since i loops from 1 to n, rather than 0 to n-1.  (Note that the line IUnknown* pUnknown = listObjs[i]; is still correct.  The cleanup loop that deletes listObjIds[i] needs the same i-1 fix.)  So I was writing beyond the memory allocated to the listObjIds array.  And amazingly, it worked just fine in all my testing.  Most of the time, the next sizeof(void*) bytes on the heap aren’t going to belong to anyone.  But there is a chance that they are used for some other variable, whose value you would be overwriting.  This is more likely if memory has become very fragmented.
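
The index shift is easier to see in isolation.  Below is a rough Java sketch of copying from a 1-based list into a 0-based array; OneBasedList is a made-up stand-in for the LIST() macro, not the real thing.  One difference worth keeping in mind: Java checks array bounds, so writing out[i] at i == size would throw an ArrayIndexOutOfBoundsException right away instead of quietly scribbling on the heap the way the C++ code did.

```java
class OneBasedCopyDemo {
    // Hypothetical stand-in for the 1-based LIST(): valid indexes are 1..size().
    static class OneBasedList {
        private final String[] items;
        OneBasedList(String... items) { this.items = items.clone(); }
        int size() { return items.length; }
        String get(int i) { return items[i - 1]; }
    }

    // Copies a 1-based list into a 0-based array, shifting the index on the array side only.
    static String[] toArray(OneBasedList list) {
        String[] out = new String[list.size()];
        for (int i = 1; i <= list.size(); i++) {
            out[i - 1] = list.get(i);
        }
        return out;
    }

    public static void main(String[] args) {
        OneBasedList list = new OneBasedList("a", "b", "c");
        System.out.println(String.join(",", toArray(list)));  // a,b,c
    }
}
```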

We run unit tests on all four operating systems we support, but only one operating system (HP-UX) was affected by this.  And since we don’t currently have any customers using that OS, it was a while before anyone looked at the traces very closely.  Unfortunately, it happens that this code was implemented in a listener that is called every single time the user saves.  So when it was discovered, it was something that had to be fixed before the final build.  We could have delivered it as a patch, but some customers are reluctant to deploy patches because that can mean shutting down production for a few hours.  Plus it doesn’t instill confidence to say we shipped broken code.  So delaying the final build for a day or two was the best option.

The worst part of it all is that it happened just before year-end performance reviews.  Doh!


Array-casting in Java

Written by Kip on Friday, October 19, 2007 at 1:23 pm (EDT)

Since I haven’t posted anything this week, I figured I’d share something annoying I discovered in Java: you can’t assume that you can put an object of type T into a T array (unless you happen to know that T is declared as a final class).

Take for example this code, which tries to put an Integer (which is an Object) into an array of Objects:

  public static void main(String[] args)
  {
    Object[] objects = new String[2];
    objects[0] = "ABC";
    objects[1] = new Integer(5);
  }

This code compiles with no problem, but when run it throws an ArrayStoreException on the objects[1] = line.  But if the array were created with new Object[2], it would run with no complaints.

The problem is that you’re allowed to treat an array of type T as an array of a super-type of T (no cast required), but you don’t really have an array of the super-type: the array remembers its real element type and checks every store at run time.  I imagine they decided to allow this because of the usefulness of treating arrays as arrays of a super-type for reading the data.  But it opens up a whole new set of bugs that most of the time you wouldn’t even think to check for (especially if the array is declared in someone else’s code).
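
Here is a small Java sketch of that behavior.  The tryStore() helper is just something made up for the demo; it reports whether a store through an Object[] reference was allowed.

```java
class ArrayCovarianceDemo {
    // Tries to store a value through an Object[] reference; reports whether the JVM allowed it.
    static boolean tryStore(Object[] array, Object value) {
        try {
            array[0] = value;
            return true;
        } catch (ArrayStoreException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Object[] reallyStrings = new String[1];   // compiles: String[] is accepted as Object[]
        Object[] reallyObjects = new Object[1];

        System.out.println(tryStore(reallyStrings, "ABC"));               // true
        System.out.println(tryStore(reallyStrings, Integer.valueOf(5)));  // false
        System.out.println(tryStore(reallyObjects, Integer.valueOf(5)));  // true
        System.out.println(reallyStrings.getClass().getSimpleName());     // String[]
    }
}
```

The last line is the giveaway: the variable says Object[], but the array itself still knows it is a String[].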

Apparently C# has controversially included the same feature.


//it’s in the comments

Written by Kip on Thursday, September 27, 2007 at 2:13 pm (EDT)

Sure, coming up with a great algorithm can be fun, but programmers only get true freedom to express themselves in their comments.  And it’s always fun when you are fixing a bug and run across a funny or ironic comment that you don’t recall typing but you can tell without a doubt that you were the one that wrote it.  While looking through some code I wrote about a year ago, I ran across these comments and thought I’d share.

long numSubLists = (numPRC + (MAX_SUBLIST_SIZE - 1))/MAX_SUBLIST_SIZE; //this math is right.. work it out if you don't believe me. :)

Assert(numReturned == returnSize); //this could only be false if something is broken hard...
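
For anyone who doesn’t feel like working out the numSubLists math by hand, here is a quick Java sketch that checks it against Math.ceil.  The method name and the sublist size of 4 are made up, and it assumes a non-negative count and a positive sublist size.

```java
class CeilingDivisionDemo {
    // Same formula as the commented line: integer division rounds down,
    // so adding (size - 1) first makes any partial sublist count as a whole one.
    static long numSubLists(long count, long maxSubListSize) {
        return (count + (maxSubListSize - 1)) / maxSubListSize;
    }

    public static void main(String[] args) {
        long max = 4;  // hypothetical MAX_SUBLIST_SIZE
        for (long count = 0; count <= 9; count++) {
            long expected = (long) Math.ceil((double) count / max);
            System.out.println(count + " items -> " + numSubLists(count, max)
                    + " sublists (ceil says " + expected + ")");
        }
    }
}
```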

PS: assertions are awesome.  I’ve found nothing else to be better at preventing future programmers (especially myself) from breaking my code.


Java overload

Written by Kip on Friday, January 12, 2007 at 10:21 am (EST)

Something I learned today: overloaded method resolution in Java is done at compile-time, not runtime.  Warning: if you have no idea what half the words in that sentence meant, then you probably don’t care about the rest of this post.

Let’s say Alice (who is in tons of hacking books for some reason) wrote this code:

  public class Alice
  {
    public static double roundToNearestFive(double d)
    {
      return Math.round(d / 5.0) * 5.0;
    }
  }

And let’s say Bob (who totally has a thing for Alice even though she says they are just friends) wrote this code:

  public class Bob
  {
    public static void main(String[] args)
    {
      System.out.println(Alice.roundToNearestFive(12));
    }
  }

When Bob runs his code it will print 10.0, since the integer 12 will be silently widened to a floating-point number (12.0) to be passed into roundToNearestFive(), which hands back a double.  Bob could write (long) Alice.roundToNearestFive(12) to fix this, but that is annoying because now he has to cast the result of a rounding function back to an integer.  So Bob asks Alice to provide a fix, and she adds a version of the function which takes and returns an integer:

  public class Alice
  {
    public static double roundToNearestFive(double d)
    {
      return Math.round(d / 5.0) * 5.0;
    }
    public static long roundToNearestFive(long n)
    {
      return Math.round(n / 5.0) * 5L;
    }
  }

She sends a new .jar file to Bob with the change, and he runs his code again, expecting it to now output 10.  But it still prints 10.0.

The problem is that when Bob compiled his code, there was no roundToNearestFive(long) function available.  So the compiler generated bytecode that looked something like Alice.roundToNearestFive( (double)12).  So even when he runs with Alice’s new code in place, the bytecode is still forced to call the floating-point version of the function.  The only solution for Bob is to recompile his code against the new .jar file sent from Alice.
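
The two-jar scenario can’t be reproduced in a single file, but the rule underneath it can.  Here is a rough Java sketch, with made-up which() and describe() methods, showing that the compiler picks the overload from the declared types of the arguments and never looks at what the value turns out to be at run time.

```java
class OverloadDemo {
    static String which(long n)   { return "long version"; }
    static String which(double d) { return "double version"; }

    static String describe(Object o) { return "Object version"; }
    static String describe(String s) { return "String version"; }

    public static void main(String[] args) {
        // An int argument can widen to either parameter type; long is the more specific match.
        System.out.println(which(12));      // long version
        System.out.println(which(12.7));    // double version

        Object o = "really a String";
        // The overload is chosen from the declared type (Object), not the runtime type (String).
        System.out.println(describe(o));            // Object version
        System.out.println(describe((String) o));   // String version
    }
}
```

Whichever overload the compiler settles on is what gets written into the class file, which is why swapping in a new .jar alone changes nothing for code that was already compiled.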

For further reference, here is the spec for binary compatibility in Java.  And here is more information about Alice and Bob.


Kip rambles about bad programmers again

Written by Kip on Monday, November 13, 2006 at 5:31 pm (EST)

Here is a database query that has a potentially huge problem:

  select * from users where username = '$username' and password = '$password'

If you’re not a programmer, bear with me, I’m sure you can still follow the problem here.  In the line above, $username contains the value the user gave for their username, and $password contains the value given for their password.  Let’s say my username is “kip” and my password is “12345”.  That gives us:

  select * from users where username = 'kip' and password = '12345'

So far so good, a database can execute that just fine.  But what if my password is “My dog’s name is spot”?  That gives us this:

  select * from users where username = 'kip' and password = 'My dog's name is spot'

See the problem?  The database will think the password is just “My dog”, since there is a single-quote in the password.  It will additionally not know how to handle the rest of the statement and probably return an error, preventing the user from ever logging in.

Nothing I’ve said here should be news to a programmer.  In introductory programming courses, students are often asked to write a program where the user is asked for input (let’s say, a number from 1-10), and the program must not fail if the user enters something entirely different (let’s say, “judicious”).  What is happening in my example is in no way fundamentally different.

If you’re thinking to yourself, “Hey Kip... you’re not writing this post because you just figured this out.. are you?”, rest assured that I am not.  I am writing this because (a) I like to pretend that my blog has more than a dozen readers; and (b) because I have seen several sites discussing this type of bug lately.  The implication is that many programmers—presumably the paid, professional types (not just amateurs)—would put user input inside single-quotes without entertaining the possibility that the user might enter text with single quotes in it.  It seems like one of those things that you shouldn’t need to be taught—you should logically know to validate user input, even if you have never received formal training in programming.

Thus far, I haven’t even talked about the security hole caused by this code:  someone could intentionally use a single-quote in their password to exploit this bad code.  For just one of many examples, giving a password of “' or 'abc' = 'abc” will let you into any existing user’s account (this is called SQL injection).  I can understand why a programmer might not see that security hole immediately.  But the security hole is just an abuse of a bug that a logical human being should have seen in the first place.
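
To make this concrete, here is a rough Java sketch.  buildUnsafeQuery() pastes the input into the SQL text the way the query above does, and main() prints what the database would actually be handed.  credentialsMatch() shows the usual alternative, a JDBC PreparedStatement with placeholders; it is never called here because the sketch has no database behind it, and the method names are made up.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class LoginQueryDemo {
    // The approach from the post: user input is pasted straight into the SQL text.
    static String buildUnsafeQuery(String username, String password) {
        return "select * from users where username = '" + username
                + "' and password = '" + password + "'";
    }

    // Parameterized alternative: the SQL text is fixed and the driver passes the
    // values separately, so quotes in the input are never parsed as SQL.
    // Not called from main, because this sketch has no database to connect to.
    static boolean credentialsMatch(Connection conn, String username, String password)
            throws SQLException {
        String sql = "select 1 from users where username = ? and password = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, username);
            stmt.setString(2, password);
            try (ResultSet rows = stmt.executeQuery()) {
                return rows.next();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUnsafeQuery("kip", "12345"));
        System.out.println(buildUnsafeQuery("kip", "My dog's name is spot"));
        System.out.println(buildUnsafeQuery("kip", "' or 'abc' = 'abc"));
    }
}
```

The third line printed ends in password = '' or 'abc' = 'abc', which is true for every row in the table.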

</soapbox>


Kip rambles about programming

Written by Kip on Tuesday, September 26, 2006 at 7:08 pm (EDT)

A few weeks ago, I had to explain inheritance to someone who has been working with software for over a decade.  Sure, he may not have learned about inheritance back when he got his degree, and he may have spent some of his career in sales traveling around the world without writing code.  But for the last three or four years, his job—you know, the thing that pays for his kids’ food—has been to write and maintain Java code.  I expected him to have picked up on this whole object-oriented thing by now.  Since it’s a fundamental concept of Java and all.

Here’s a very simplified version of the code in question, formatted to be as brief as possible:

class A {
  protected boolean isFooRequired() {  return true;  }
  public void doSomething() {
    if (isFooRequired())
      foo();
  }
}

I suggested he fix a bug by adding isFooRequired() to a subclass:

class B extends A {
  protected boolean isFooRequired() {  return false;  }
}

He didn’t understand how the line “if (isFooRequired())” would know to call the isFooRequired() method in class B for an instance of B.  It’s called polymorphism.  Look it up.
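
For the record, here is a runnable Java sketch of the same idea.  In this version doSomething() returns a string instead of calling foo(), purely so there is something to print; that change and the wrapper class are mine, not part of the real code.

```java
class PolymorphismDemo {
    static class A {
        protected boolean isFooRequired() { return true; }

        // Returns what doSomething() decided, so the effect is easy to print.
        public String doSomething() {
            // The call is dispatched on the runtime class of "this", not on A.
            return isFooRequired() ? "foo() called" : "foo() skipped";
        }
    }

    static class B extends A {
        @Override
        protected boolean isFooRequired() { return false; }
    }

    public static void main(String[] args) {
        A plainA = new A();
        A reallyB = new B();   // declared as A, but the object is a B

        System.out.println(plainA.doSomething());    // foo() called
        System.out.println(reallyB.doSomething());   // foo() skipped
    }
}
```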

That got me thinking about my own software development knowledge.  I work in a field where half of my technical knowledge will be obsolete in five years, and I have probably learned almost as much since graduating as I did in school, so how much good did my degree really do me?  A lot of what I know I have learned from websites and blogs for developers.  I guess they are kinda filling the niche that industry magazines used to fill.  In ascending order of usefulness, the sites I visit most would be The Old New Thing, The Daily WTF, A List Apart, and especially Joel On Software.  And please let me know if I’m missing out on any good ones.

So anyway I’ve learned a lot of things that they just don’t teach you at school, or that you would never want to learn in a classroom (e.g. having a test where they ask if you should require developers to write code in the interview would be dumb).  But there are many things that I wasn’t really taught, or that I was taught only through an elective.  Regular expressions have been extremely useful in my two years of professional programming, yet I only learned them in a one-hour Perl course.  I was never taught closures, and I only learned a functional programming language (Lisp) in an elective (Artificial Intelligence).  Not that I have ever used Lisp in the real world, but closures are nice and allow programmers to do some powerful things very easily.  Java and PHP kinda have things that are sort of like closures, but not really.  Perl has them, but I never write any code in Perl complex enough to need them.  I was never taught databases: I strayed from the databases elective since I knew that if I took it, I would list it on my resume, and I was afraid that listing the course would land me a job as a database administrator.  And I did not want to be a DBA.

In my software engineering course, we went over some concepts that are very important, but don’t really make sense to be tested on; it might have been more effective as a required series of lectures or something, provided the lecturers did a half-decent job.  I know I would have paid more attention that way.  Design patterns were covered in that course very briefly, although knowledge of them has been invaluable to me in the real world, and I was asked about them on nearly every job interview I went through.

I guess where I’m going with this rant is that I make an effort to keep up with the latest and greatest.  It is something I am interested in (which is one of the reasons I chose to major in Computer Science in the first place), and it is vital to me being good at what I do.  Shouldn’t I expect others in this field to make at least a minimal effort to do the same?


URL rewriting

Written by Kip on Monday, May 29, 2006 at 6:00 pm (EDT)

My site now uses URL Rewriting.  It was surprisingly simple to set up.  I found this overview covered all the basics, and a few Google queries gave me this useful cheat sheet that covered pretty much everything else I needed.  Both of them are kind of aimed at people who don’t know how to use regular expressions, but I do know how to use them so I can’t vouch for how well they explain them.

Any old links you have to my site should work (for now), but you may want to update them.  In particular, the RSS feeds have changed:

Kip’s feed:  http://www.vacant-nebula.com/rss/kip/
Stephanie’s feed:  http://www.vacant-nebula.com/rss/stephanie/
Both:  http://www.vacant-nebula.com/rss/

Like I said, the old URLs will work for a while, but I’m not making any promises for what will happen two months from now.
