Java String Pool

What will the following lines of code print?

1  String a = "abc";
2  String b = "abc";
3  System.out.println(a == b);
4  String c = new String("abc");
5  String d = new String("abc");
6  System.out.println(c == d);

In order to test for equality between two objects, you must use the equals method.  Otherwise, if you test the equality between two variables that point to objects, you’re comparing the value it stores; in this case the address that reference the objects.  As a result, both println statements should both return false, right?  Why does the first println statement (line 3) print true?

It turns out that variables a and b both point to the same object.  This is because both variables reference string literal objects.  A string literal is instead stored in a special area of the Java PermGen heap space called the String Pool.  The String Pool acts as a cache for string literals; If you create a string literal that already exists in the String Pool, Java is smart enough to use the existing reference.

On line 1, when I set the variable a to the literal “abc”, it will store that value in the String Pool.  When I do the same again for variable b, on line 2, it will find the same value in the String Pool and use the reference to the existing object.  Not only does this save memory, but the values of the variables a and b are both equal since they reference the same object.

On the other hand, if we create a String using the new operator (line 4), we will first find the “abc” reference that exists in the String Pool.  This reference will be passed into the String constructor to instantiate a completely new object that now lives in the Java heap space.  After we create two instances of the String object using the new operator (line 5), we have two completely separate instances of the String object.  Therefore the variables c and d reference two completely separate variables.

As you can see, it almost always never makes sense to use the new String(String) constructor.  If for whatever reason you must use this constructor, you may use the String.intern() method to place the object in the String Pool. This method will then return the  reference to the object that lives in the String Pool.

Given what you have just learned, what will the following lines of code print?

System.out.println(c.intern() == d.intern());
System.out.println("a" + "bc" == "ab" + "c");
System.out.println("abcd" == "abc" + "d");

Gawker Password Database Cracking

If you haven’t heard already, Gawker Media has been hacked! Long story short: The Gnosis crew has taken credit for breaking into Gawker Media’s servers and downloaded a database of roughly 1.3 million user records. One of the big stories here was that Gawker stored user passwords using the block cipher DES (Data Encryption Standard). This encryption scheme only encrypts the first 8 characters of a password before storing it in the database. This means ‘password1sSecureBec@useIt’sLong_4nd_uses_5pecial_characters’ will only be stored as DES(‘password’).

My senior project in college involved using the distributed computing framework Hadoop to crack passwords, so naturally this piqued my interest. Unfortunately I don’t have access to a cluster of computers, so instead I wanted to take the opportunity to learn something new (Cuda comes to mind). Before I jump into playing with Cuda, I first wanted to get a feel for what I was doing by writing some Python code to get me started.

First thing I had to do was filter out the records in the database that were not valid or didn’t have an encrypted password. Of the 1,248,120 records there were only 748,508 that I considered valid. Each password was encrypted using a salt, so I couldn’t simply create a rainbow table of encrypted passwords and do a lookup. Instead I had to crack each password individually. I decided to crack the passwords using a dictionary-based attack. I used a list of the 500 most commonly use passwords as my initial dictionary (The most popular passwords being first). I figured this would be my best bet for quickly shortening the list passwords that I would need to crack. The jist of how this works is the following:
1. Read in a word from the list.
2. Iterate through each account and compute the encrypted value using the user’s given salt.
3. If the computed value is the same as the encrypted password, then we have the original password.
4. If the password was cracked, remove it from the list so we don’t have to worry about it in future computations.

As of this writing I’m at the 39th word and I’ve managed to shorten the list by 16,512 users, or by 0.02%. This doesn’t seem to be going fast enough, so perhaps I’ll look more into doing this with something faster such as C or Cuda. The point of this exercise was really to get myself familiar with the process of cracking these passwords. I was able to quickly write some code in Python to test my ideas, and I quickly found out what ideas worked and what didn’t.

Starting out with Nmap

Nmap is one of those programs that I can’t live without.  Whether I’m using it to learn more about my own network, or somebody else’s network, it really gives me a better picture of what other devices are on the network without actually having to actually physically see the device.

Scanning your network

One of the basic ways to discover devices is to use a ping sweep with Nmap.  This sends out a ping to a single device or a range of devices. It used to be that hosts that dropped ICMP echo requests (ping) would up show as being down in Nmap. It seems that newer versions of Nmap are able to detect if a host is up if it doesn’t respond to a ping. (Try ping microsoft.com and see what it responds with.)

Pinging a single device

nmap -sP 192.168.0.1

Results

Nmap scan report for 192.168.0.1
Host is up (0.055s latency).

Pinging a range of IP addresses

nmap -sP 192.168.0.1-5

Results

Nmap scan report for 192.168.0.1
Host is up (0.0046s latency).
Nmap scan report for 192.168.0.3
Host is up (0.016s latency).
Nmap scan report for 192.168.0.5
Host is up (0.000098s latency).

Pinging the entire network

nmap -sP 192.168.0.0/24

Results

Nmap scan report for 192.168.0.1
Host is up (0.055s latency).
Nmap scan report for 192.168.0.3
Host is up (0.014s latency).
Nmap scan report for 192.168.0.5
Host is up (0.00029s latency).
Nmap scan report for 192.168.0.100
Host is up (0.0090s latency).

The last command will send a ping out to any device that has their IP address starting with 192.168.0.x.  (The /24 is short-hand for the subnet mask of 255.255.255.0.)

Device Enumeration

Once we have a list of devices on our network we may want to know some information about it.  Simply typing the command nmap along with the IP address or name of the device, will return what ports are open on the device and guess what service is running on the device.  For example, when I run nmap on my own machine I get the following:

nmap 192.168.0.5

Results

PORT     STATE SERVICE
22/tcp   open  ssh
88/tcp   open  kerberos-sec
139/tcp  open  netbios-ssn
445/tcp  open  microsoft-ds
515/tcp  open  printer
631/tcp  open  ipp
3689/tcp open  rendezvous

What if I want some more information like what version of ssh am I running?  Use the -sV flag.

nmap -sV 192.16.0.5

Results

PORT     STATE SERVICE      VERSION
22/tcp   open  ssh          OpenSSH 5.2 (protocol 2.0)
88/tcp   open  kerberos-sec Mac OS X kerberos-sec
139/tcp  open  netbios-ssn  Samba smbd 3.X (workgroup: WORKGROUP)
445/tcp  open  netbios-ssn  Samba smbd 3.X (workgroup: WORKGROUP)
515/tcp  open  printer
631/tcp  open  ipp          CUPS 1.4
3689/tcp open  daap         Apple iTunes DAAP 9.0.3
Service Info: OS: Mac OS X

Also notice we get some extra information such as the OS I’m running and the workgroup my samba shares are on.

Now let’s take what we learned and put them together.  How would I get information on all devices on my network?

nmap -sV 192.168.0.0/24

The output of this is too long for here, but I think you get the idea; it’ll display all the service information (along with service versions) for all hosts on the network.

That’s the basics of using nmap.  With the ideas presented here, you should be able to find out most information about about devices on the network you are currently connected to.

Most listened to music of the year

According to last.fm, this is the final count of the bands I listened to the most this year.

  1. Nine Inch Nails (528 Plays) – Nine Inch Nails is and has been one of my favorite bands for about 13 years now. NIN has always been the goto band for me for when I don’t feel like listening to anything else. This year I finally got around to actually listening to Year Zero. I was not enthusiastic about this CD when it first came out, but after seeing them live at the Santa Barbara Bowl this year, I finally got around to listening to it and love it! I actually found that it made for good running music! On top of that, I rediscovered what has to be one of my favorite albums: The Fragile. This CD is up there with Dark Side of the Moon in terms of replayability.
  2. NOFX (431 Plays) - Another one of goto bands when I don’t feel like listening to anything else. NOFX has been another one of my favorite bands for about 11 years now and I’m still not tired of them. This year they release coaster which has been in regular rotation in my library. This summer in particular I have also been re-listening to a lot of their albums when I went on my runs.
  3. The Appleseed Cast (329 Plays) - The Appleseed Cast have been one of my favorite bands of recent times. This year they released Sagarmatha which I absolutely feel in love with. The first track “As The Little Things Go” has to be my favorite track of the year. I will hopefully get to see them live for the first time in 2010.
  4. Placebo (291 Plays) - Again, another one of my favorite bands that I started to around 9 years ago. This year they released “Battle for the Sun” and did it without their original drummer Steven Hewitt. Despite his absense, Placebo released this slightly popier sounding album, but still staying true to their original sound. I have yet to see this band live; this needs to change!
  5. Muse (203 Plays) - This year muse released “The Resistance”, but I’ve also been rediscovering their CDs such as “Origin of Symmetry” and “Absolution”. I also recently watched the movie “Southland Tales” (which I won’t go into here), but it featured the song “Blackout” off of Absolution that I really never listened to. Holy crap what a great track!
  6. Asobi Seksu (181 Plays) - This band is a Japanese/American Indie/Pop band that I admit is a guilty pleasure. This year I’ve been listening to their self-titled album and “Citrus” mostly. This year they came out with the CD “Hush,” but it hasn’t gotten as much playtime as the other two albums from me.
  7. Pixies (163 Plays) - I had my pixies phase around six years ago. I think now was the time to come back and check out my old favorites. Not much to say here really.
  8. Steve Aoki (148 Plays) - Introduced to me randomly by a friend in SLO in August, Steve Aoki has been a running favorite this year.  Steve Aoki kind of reminds me of Girl Talk which I probably listened to A LOT last year.
  9. The Protomen (142 Plays) - The concept of a rock opera based on the Mega Man video game series just cannot fail. I was a big fan of the first act, but when act II came out, I listened to both acts back to back. Both CDs have a unique feel to them; the first act is more inspired by the videogame series (with a hint of Ennio Morricone) while the second act follows up with a sound inspired by the 80s.  Someone needs to turn these albums into an anime!
  10. Pink Floyd (140 Plays) - Pink Floyd is not one of those bands where you pick and choose their greatest hits. Instead when you listen to Pink Floyd, you listen to them an entire album at a time. In particular I remember listening to The Wall, Animals, and Dark Side of the Moon this year.

Find the Largest Prime Factor

I love programming competitions.  In the past two years I’ve participated in both my school’s programming competition and the ACM Southern California regional competition.  Recently I’ve been turned onto http://projecteuler.net which is a competition-like site that gives you a series of math questions which you need to solve by writing a program.  I saw this as a good opportunity to brush up on python, so today I’ve been trying my hand at a few of these problems.

Question #3 asks you to find the largest prime factor of a very large number.  My first instinct was to start at Sqrt(n) and work my way back until I find the first factor.  The problem here is that the first factor you find may not be the first PRIME factor.  Instead, my approach to this problem was to find all factors of the number between 2 and Sqrt(n) and put them in in a list.

Of all the numbers in the list, one of them is the largest prime factor, but which one?  To find out, I made an assumption that turned out to be right:  The composite factors have at least one prime factor that is already in the list.  If I can show that a given number has no factors in the list, then it is a prime factor.  Further, if I have a sorted list and start at the end, I can show that the number is the largest prime factor.

Anyway, here’s the code I used:

n = 600851475143 # Find the largest prime factor of this
div = [x for x in range(2,int(math.sqrt(n))) if n % x == 0] # Get all factors
i, j = 0, len(div) – 1
while j > i: # Determine which one of the factors is the largest prime
# This number is composite, go to the next largest factor
if div[j] % div[i] == 0: i,j = 0,j-1
# See if the next number at the beginning of the list divides the current number
else: i += 1
print div[j] # Print out the largest prime factor
n = 600851475143 # Find the largest prime factor of this
div = [x for x in range(2,int(math.sqrt(n))) if n % x == 0] # Get all factors
i, j = 0, len(div) - 1
while j > i: # Determine which one of the factors is the largest prime
# This number is composite, go to the next largest factor
if div[j] % div[i] == 0: i,j = 0,j-1
# See if the next number at the beginning of the list divides the current number
else: i += 1
print div[j] # Print out the largest prime factor

Fix for CodeIgniter White Screen on Bluehost

This past weekend I spent writing an application using the CodeIgniter framework for my programming languages class.  Sunday night I finally got it working and it was time to move it to production on bluehost.  After I made created the database and made some configuration changes I checked it out and I got nothing.  It was a white page of nothing.  I checked my logs and found nothing.  Normally this wouldn’t be such an issue, but when you have nothing in your logs, then you don’t have much to work with.  I looked at the source of the blank page and found this: “<!– SHTML Wrapper – 500 Server Error –>.”  Still not a lot to work with.

Today I spent some time investing the issue and found out that a lot of people were having the issue, but I didn’t find any fixes.  Long story short the fix was to enable FastCGI PHP5 on Bluehost.  When I did that an error finally appeared on the screen! Apparently I misspelled my database name >_<.

Anyway, if you’re having the same issue, let me know if this fix worked for you.

PHP Frameworks

I’ve been coding in PHP off and on for years now.  Each project I’ve done thus far has always been by hand.  Doing the architecture of your site by hand is the easiest way to introduce bugs and end up with sloppy code.   For the first time I’ve started a project where I used a framework in PHP.  My framework? Codeigniter.  I like it’s simplicity, MFC design and how it’s not like Rails.  I thought Rails was a bit too complicated and involved hacking the internal code way too much to get anything done, but that’s a whole other rant.

Anyway, the project I’m working on?  Just a twitter clone.  Why?  For the sake of learning, that’s all.  It’s actually for a project in school.  The idea is to implement this idea in something that I’m already familiar with (PHP) and then implement it again in another language I’m not familiar with at all.  After I’m doing with the first part I want to implement it in Python using the Django framework.  Python has always been one of those languages I’ve been meaning to learn.  In fact it seems I try to learn it every summer, but I just don’t seem to have the dedication to sit in doors learning it.

Regardless, I hope to use this knowledge in my professional life.  The last PHP project I worked on was stressful just because of some of the design decisisons I made (Doing a half-assed MVC implementation…didn’t do the best job of dividing up the view and controller).  However, I’ve learned from those mistakes and I might be taking up a new project at work where I’ll probably be using this framework.

Using SSL with Gmail and Gmail Notifier

With the recent announcement at Defcon, Gmail users will soon be the target of session hijacking. The reason for this is that Gmail by default does not encrypt any traffic (except logins). This allows anyone on the local network to sniff for session ids passed between gmail and the user when you check your email.  With this session id, a hijacker can act authenticate themselves as you without the need for your username and password.

This has always been an issue for non-encrypted traffic, but it was announced at defcon that a tool has been released that automates this hack.  This was enough reason for Google to release an option to turn on SSL.  The problem here is that you still have to manually turn it on.

To turn on SSL go to “Settings” and at the bottom you’ll see an option called “Browser Connection.”  Choose the option “Always use https.”  Yay now we’re protected from session hijacking!

ssl.png
The problem I next noticed was that gmail notifier stopped working!  After doing some investigating, I found that you had to do some hex editing with gnotify.exe to get it to use SSL.

Before we do anything, close and make a backup copy of gnotify.exe just in case anything happens.  By default you can find this executable in C:\Program Files\Google\Gmail Notifier.  For hex editing I used an old favorite hex editor called “Hex Workshop” for Windows.  After you download/install it, open up gnotify.exe in Hex Workshop.  On the left you’ll see a bunch of hexidecimal characters and on the right you’ll see the ASCII equivalent.

To make the replaecment, we need to first find the area we want to modify.  To find the area, hit CTRL-F and let’s do a search for the string we want to modify: “http://”.  Under the “Type” drop-down choose “Text String.”  When you find “http://mail.google.com/mail/” go ahead and add a “s” after “http.”  You’ll see that whenever you type, it will overwrite whatever was in that field before.  Go ahead and type out the replaced characters until you end up with “https://mail.google.com/mail/”.

hex.png

Go ahead and save the modified executable and open it back up.  If it fails, you can always use the backup you made!  Otherwise, you should know have access to gmail over an encrypted connection!

I’m sure I’ll be writing a part II to this when I get home and my gmail notifier isn’t working there either.

Explosions in the Sky

How do we hear about new music? You might say the radio or MTV. Let’s be serious though, MTV doesn’t play music videos, not anymore at least. As far as the radio goes, I can only stand listening to the same 10 songs over and over again only so much. What about friends? Word of mouth is a great way to hear about new bands. If you know that one of your friends has similar music taste, you’re more inclined to listen to a new band per their recommendation. Now with the internet you have an additional resource to aid in discovering new music.

Last.fm is a site I’ve probably talked about before (whether on this blog or in person). It combines the last two ways of finding music: your friends and the internet. I’m discovered many bands such as Silversun Pickups and Mogwai based off of what my friends were listening to. One of the other reasons I love last.fm is how they recommend new bands. Not only do they let you see and read about bands who are similar to bands you already love, but they let you hear those bands too. I believe I was looking for artists similar to Mogwai when I stumbled upon “Explosions in the Sky” on the similar artist radio station. After one listen of “The Birth and Death of the Day” I was hooked.

You might say that I’m just going through a phase, which is true. I usually go through phases where I’ll listen to the same artist over and over again until I find the next one (Sigur Ros, Stars, Broken Social Scene were some of my most recent phases). Right now I’m on my Explosions in the Sky kick and I look forward to discovering other Post-Rock bands.

For all of you HTML hackers out there

I’m working on a new project at work which involves the use of Microsoft SQL Server and DTS packages…something I’m not familiar with.  As always, if I don’t know something I google it.

One of the sites I came across was this one. You’ll see that there is a pop-under that is blocking the content.  See if you can use Firebug to get rid of the pop-under to see the rest of the content.

Hint: After you get rid of the pop-under, you still can’t see all of the content.  Check the div style properties!