What Chords Should You Learn First?

Introduction

Have you ever wondered what the ten most common guitar chords in western music are? Well, here they are, and unfortunately, you still have to learn that damned F.

# CHORD PERCENT OF ALL
1 G 14.0854%
2 C 11.5644%
3 D 11.3339%
4 A 7.8566%
5 F 5.9005%
6 Am 4.9978%
7 E 4.9817%
8 Em 4.518%
9 Dm 2.1277%
10 Bm 2.0687%

If you read on, you’ll find the top 100 chords below, along with a method for repeating my experiment with your own data set.

Background

As a programmer and amateur guitarist, I have often used online resources to improve my playing. However, I’ve always found a few questions the internet seems unable to answer satisfactorily. Key among these was one of the first questions I ever asked anyone who played guitar: “What chords should I learn first?” To the skilled guitarist, the answer seems easy: “All of them.” But a new player who wants to develop a love for playing needs a strong starting point from which to learn new songs, stay interested, and grow in skill. Some places recited the standard litany: learn C, G, D, Am, & Em, and you’ll have the majority of pop music (if you capo). But that wasn’t enough for me to pick up a chord sheet from the internet and start playing. They all seemed to need chords I didn’t know yet. Frankly, it was a frustrating time for me, and I just had to tough it out.

Last week I went back to work on my sheet music software, “Repertoire”, and decided to add chord renderings. Obviously, I couldn’t predefine all 10,000+ chord forms available to a skilled player, but I certainly could add the 100 or so most common first-position chords. This led me back to the same merry question I started with: “What are they?”

Methodology

Luckily, this time around I had a few extra resources available to me. I could code, and, more importantly, I had a 4 MB folder of ChordPro-formatted chord sheets on my computer, from http://getsome.org/guitar/olga/chordpro/. I was working in PHP already, so I just wrote up a quick script:

<?php
function listFilesRecursive($dir, $extension){
    $array = array();
    $ffs   = scandir($dir);
    foreach($ffs as $ff){
        if($ff != '.' && $ff != '..'){
            $path = $dir.'/'.$ff;
            
            if(is_dir($path)) {
                $contents = listFilesRecursive($path, $extension);
                $array = array_merge($array, $contents);
            } else {
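                // pathinfo() leaves out the 'extension' key for files that have none,
                // so request the extension directly (an empty string if there is none)
                $ext = pathinfo($path, PATHINFO_EXTENSION);
                if(strtolower($ext) == $extension) {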
                    $array[] = $path;
                }
            }
        }
    }
    return $array;
}
$directory = "chords";
$files     = listFilesRecursive($directory, "chopro");
$chords    = array();
$total     = 0;
foreach($files as $file) {
    $matches = null;
    $contents = file_get_contents($file);
    preg_match_all('/\[(.*?)\]/', $contents, $matches);
    if(isset($matches[1]) && $matches[1]) {
        foreach($matches[1] as $chord) {
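            // Normalize spellings so that e.g. "Amin" and "Am" are counted together.
            // Caveat: dropping 'maj' also folds chords like Cmaj7 into C7.
            $chord = str_replace(array('maj', 'min'), array('', 'm'), $chord);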
            
            if(!isset($chords[$chord])) $chords[$chord] = 0;
            $chords[$chord]++;
            $total++;
        }
    }
}
arsort($chords);

echo "<table>\n";
echo "<thead><tr><th>#</th><th>Chord</th><th>Times Seen</th><th>Percent</th></tr></thead><tbody>\n";
$i=0;
foreach($chords as $chord=>$count) {
    $percent = round($count/$total*100, 4);
    echo "<tr><td>".$i."</td><td><b>".$chord."</b></td><td>".$count."</td><td>".$percent."%</td></tr>\n";
    $i++;
}
echo "</tbody></table>\n";
?>

The script needs to be in a folder, with a subfolder called “chords”, that contains files with the .chopro extension.  It will scan all the files in the subfolder (and all its subfolders) and extract the chords.  Then it counts occurrences of each chord, and produces an ordered list from most common to least.
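To make the extraction step concrete, here is a minimal sketch of what the regular expression pulls out of a single ChordPro-style line. The sample lyric line and the helper name extractChords are made up for illustration; only the pattern and the maj/min substitution come from the script above.

```php
<?php
// Hypothetical helper, not part of the original script: returns every
// bracketed token in a string, normalized the same way the script does.
function extractChords($text) {
    $matches = array();
    preg_match_all('/\[(.*?)\]/', $text, $matches);
    $chords = array();
    foreach ($matches[1] as $chord) {
        $chords[] = str_replace(array('maj', 'min'), array('', 'm'), $chord);
    }
    return $chords;
}

// Invented sample line in ChordPro style: chords sit in square brackets.
$line   = "[G]Some words [C]over a [Dmin]made-up [G]tune";
$counts = array_count_values(extractChords($line));
arsort($counts); // most common chord first
print_r($counts);
```

Here G is counted twice, and Dmin is tallied as Dm. Note that anything in square brackets is counted, which is how entries like N.C. end up in the results.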

Results

Here are the top 100 chords (of almost 1300) returned from processing my data set:

# Chord Times Seen Percent
0 G 21019 14.0854%
1 C 17257 11.5644%
2 D 16913 11.3339%
3 A 11724 7.8566%
4 F 8805 5.9005%
5 Am 7458 4.9978%
6 E 7434 4.9817%
7 Em 6742 4.518%
8 Dm 3175 2.1277%
9 Bm 3087 2.0687%
10 B 2950 1.9769%
11 Bb 2786 1.867%
12 G7 2027 1.3584%
13 A7 1880 1.2598%
14 D7 1832 1.2277%
15 F#m 1790 1.1995%
16 E7 1480 0.9918%
17 C7 1479 0.9911%
18 Am7 1275 0.8544%
19 C#m 1246 0.835%
20 F# 1222 0.8189%
21 Eb 1023 0.6855%
22 Gm 996 0.6674%
23 B7 973 0.652%
24 Em7 921 0.6172%
25 F7 824 0.5522%
26 Dm7 817 0.5475%
27 Ab 596 0.3994%
28 Cm 577 0.3867%
29 Bm7 537 0.3599%
30 C# 511 0.3424%
31 D/F# 454 0.3042%
32 Gm7 447 0.2995%
33 G#m 367 0.2459%
34 G# 363 0.2433%
35 C/G 361 0.2419%
36 Fm 355 0.2379%
37 F#m7 334 0.2238%
38 G/B 321 0.2151%
39 F#7 270 0.1809%
40 G6 264 0.1769%
41 Asus4 259 0.1736%
42 Bb7 249 0.1669%
43 Cm7 228 0.1528%
44 D# 223 0.1494%
45 C9 209 0.1401%
46 Hm 206 0.138%
47 C/B 184 0.1233%
48 Dsus4 184 0.1233%
49 H7 180 0.1206%
50 A# 179 0.12%
51 Db 177 0.1186%
52 C/E 172 0.1153%
53 D9 150 0.1005%
54 Bbm 149 0.0998%
55 Gb 148 0.0992%
56 Asus2 146 0.0978%
57 C#m7 146 0.0978%
58 Esus4 143 0.0958%
59 G/F# 142 0.0952%
60 Dsus 141 0.0945%
61 Cadd9 139 0.0931%
62 G/D 130 0.0871%
63 D/A 127 0.0851%
64 A/C# 120 0.0804%
65 N.C. 109 0.073%
66 G5 104 0.0697%
67 Dsus2 99 0.0663%
68 C#7 99 0.0663%
69 A5 97 0.065%
70 E/G# 92 0.0617%
71 Ebm 91 0.061%
72 G9 91 0.061%
73 F/G 87 0.0583%
74 D6 87 0.0583%
75 Eb7 85 0.057%
76 A/E 84 0.0563%
77 Gsus4 82 0.055%
78 F/A 81 0.0543%
79 A9 79 0.0529%
80 C(9) 77 0.0516%
81 E9 76 0.0509%
82 Abm 75 0.0503%
83 D#m 75 0.0503%
84 D/C 72 0.0482%
85 Fm7 72 0.0482%
86 Esus 70 0.0469%
87 G/A 68 0.0456%
88 D2 68 0.0456%
89 Csus4 67 0.0449%
90 A7sus4 65 0.0436%
91 E5 65 0.0436%
92 em 64 0.0429%
93 A6 64 0.0429%
94 D/E 64 0.0429%
95 Ab7 63 0.0422%
96 Gm6 62 0.0415%
97 Am/G 59 0.0395%
98 A/D 55 0.0369%
99 G+ 55 0.0369%

Disclosures

Now, looking through the list of results, I find several potential issues with the data set used.

  1. As an amateur-transcribed list, it may be unfairly weighted to the chords that amateurs are told to learn first.
  2. The choice of chord name is not always consistent.  For example, some chord sheets write A# and some write Bb, to describe the same chord.
  3. There are some incorrectly formatted files and typos that added a few false entries into the set, but they are statistically insignificant.
  4. Some transcribers seem intent on using the German/Scandinavian naming system for notes, where there is an H chord instead of B, and B is used for A#/Bb.  Surprisingly, this is so common in my set that H chords appear in the top 100.  This clearly would be different for a more standardized data set.
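
Issues 2 and 4, along with the lowercase em that shows up at rank 92, could be reduced by normalizing chord names before counting. What follows is a minimal sketch under stated assumptions, not something I ran to produce the table above. The function name normalizeChord is hypothetical, each enharmonic pair is merged into whichever spelling is more common in my results, and the German-notation flag has to be decided per file, because a bare B is ambiguous unless you know which system the transcriber used.

```php
<?php
// Hypothetical normalizer, sketched for illustration only.
function normalizeChord($chord, $germanNotation = false) {
    // Split into root letter, optional accidental, and the remainder.
    if (!preg_match('/^([A-Ha-h])([#b]?)(.*)$/', $chord, $m)) {
        return $chord; // leave anything unrecognized (e.g. "N.C.") alone
    }
    $letter     = strtoupper($m[1]);
    $accidental = $m[2];
    $rest       = $m[3]; // quality and any slash bass note are left untouched

    if ($letter === 'H') {
        $letter = 'B';      // H only exists in the German system, where it means B
    } elseif ($germanNotation && $letter === 'B' && $accidental === '') {
        $accidental = 'b';  // a German B is what everyone else calls Bb
    }

    // Assumed mapping: keep the spelling that is more common in the results.
    $enharmonic = array(
        'A#' => 'Bb', 'D#' => 'Eb', 'G#' => 'Ab',
        'Db' => 'C#', 'Gb' => 'F#',
    );
    $root = $letter . $accidental;
    if (isset($enharmonic[$root])) {
        $root = $enharmonic[$root];
    }
    return $root . $rest;
}

echo normalizeChord('em'), "\n";      // Em
echo normalizeChord('A#'), "\n";      // Bb
echo normalizeChord('Hm'), "\n";      // Bm
echo normalizeChord('B', true), "\n"; // Bb
```

Only the root is normalized here, so D/F# and D/Gb would still be counted separately.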

Conclusions

Overall, I’d say the results seem pretty solid.  And essentially they say C, G, D, Am, & Em is a fairly good starting point, but you’re still gonna have to work on that damned F chord.

Why My Company is Leaving HostGator (and why you should too)

I’ve been CTO of a small startup called Prmot.it for a few years now, offering an online coupon solution to local restaurants.  We started our operation on a shared server from HostGator, and grew from there to serve thousands of customers.

We are approaching another scaling jump, and will need to expand our infrastructure significantly.  We will be moving to a new host to meet these needs.  Don’t misunderstand me.  HostGator offers many tiers of service that would serve the level of usage our company requires well into the future, but we will instead be spending a few hundred dollars more a month, and will move to a competitor.  HostGator’s service is simply unacceptable for a small business.

The problems all began a few months ago, when HostGator was acquired by EIG and their servers were moved to a new Utah data center.  Almost immediately we began seeing system-wide outages, in some cases lasting days.  We were told that these outages were unavoidable, and always another company’s fault, but they still left our customers out in the cold.  Furthermore, these outages never seemed to be recorded in their uptime stats, or to lead to refunds under their supposed 99.9% uptime guarantee.

All of this was bad, but it wasn’t the straw that broke the camel’s back.  That came today, with their rollout of PHP 5.4 on all their servers.

Right off the bat, this upgrade was badly mismanaged, as is obvious from its scheduling.  HostGator performed this major, potentially breaking upgrade at NOON (EST) on a Friday.  This is an American host, with server times set in Central Time, and they performed a major upgrade in the middle of the workday, just before the weekend.

Next, they screwed up transferring PEAR package installs, thereby breaking my site, right in the middle of my lunch rush.

I spent 2 hours on chat with an obviously underqualified support technician, trying to get them to identify and fix the problem.  I narrowed it down to the call to a PEAR package, and showed them how commenting out the line would bring my site back online, but would break the coupon creation function necessary for it to properly function.

The support technician said he’d make a ticket, from this information, and give me the number, then disconnected before finishing.  Not knowing what to expect, I decided to poke around more, and discovered that I could fix the problem myself, by reinitializing PEAR and reinstalling the package in question.  I did so.

I had to go hunting for the ticket system and number, since I wasn’t given it, but I eventually found the ticket in question.  It read:

This is a support ticket for an issue escalated from a chat or phone call.
Primary Domain: prmt.it
Affected URL: http://prmt.it/admin
Description of Issue: Since PHP update, customer’s PEAR module is not working and the admin poage shows a blank screen. He commented out a line in and the site loads but it can’t create coupons without the Image/Text PEAR module
Path: www/yourls/user/plugins/templated-pages/functions/makecoupon.php
line 2: require_once ‘Image/Text.php’;
Steps to Reproduce the Issue: N/A
Additional Information: 

Regards,

Robert F.
Junior Administrator 
HostGator LLC
http://support.hostgator.com

I added the following reply:

I’ve repaired the issue. It appears that the PHP update broke the install of the Image/Text PEAR package. Re-installing the package solved the issue.

and went on with my day, assuming all was right.  I went out after work, and shut off my phone.  At 10:30pm, I turned it back on to find messages from customers and co-workers, wondering why the site was giving errors to every user.

What I discovered was truly shocking.

Without reading the issue listed in the original ticket, or reading my update to the ticket, and without updating the ticket to show that any work was done, a HostGator technician had gone into my live, customer-facing site, and commented out the include for my PEAR module.  This was not done to fix the issue in any way.  The line was just inexplicably commented out (with a #, no less, which makes it certain that it wasn’t an accidental revert, as I only ever comment with “//”).

With this latest issue in mind, I can no longer trust that a bumbling HostGator technician will not meddle with my live code again.  I can’t trust that tickets will be documented or handled professionally, and I cannot trust that my site will be up, when it needs to be.

In short, I cannot trust HostGator, and I will not continue to subject my clients to their shoddy, unprofessional service.  If you run across anyone who is considering HostGator, I suggest you forward them this post, and let them know that HostGator is the wrong choice.

Duplicate Image Detection in PHP/MySQL – Part 1: Fingerprinting

I was looking back over my old code, and I found some of the Duplicate Image Detection (DID) code that I wrote for Gallery 2, back in 2005.  I decided to clean it up and post a simplified version for people working in PHP with the GD extension.

There are many ways to do duplicate image detection, and I have no doubt that mine is sub-par.  But it works fairly well, and is pretty easy to explain.

If you are new to the world of programmatic image manipulation, you may be thinking “why not just use the MD5 (or CRC8) hash?”  This is actually a very common implementation of DID, but has serious limitations.  In most real-world scenarios, I don’t care if the images are byte-identical.  One might be a jpg, and the other a png, but if they’re the same image, I want to locate the duplicate.  Ideally, this same logic applies to different resolutions, watermarks, crops, rotations, and even some post-processing effects.  So, instead of traditional byte-based hashes, we’ll use what is usually called a “perceptual hash”.

In this part, we’ll handle creating what I call a “fingerprint string”.  This is a perceptual hash that represents what the image looks like in a minimal way.  There are many advanced perceptual hashing libraries like Libpuzzle and pHash, but I decided to work in a PHP environment, with a simple form of the hash, based on resizing and palettizing the image itself.

Here’s how I create a “Fingerprint”:

// Takes a GD image resource and an optional hash resolution
// Returns a fingerprint string, representing this image
function fingerprint_image($img, $resolution=8) {
    //This associative array is based on my own 62 color palette.
    //Keys are HTML color codes
    //Vals are a base-62 value representing each palette entry
    $palette_array = array(
             '000000' => '0', '000b0a' => '1', '00240c' => '2', '005ab5' => '3', '006234' => '4',
             '00a8a4' => '5', '02000c' => '6', '021357' => '7', '02be28' => '8', '03112a' => '9',
             '03efa2' => 'A', '0411a4' => 'B', '04f62d' => 'C', '0becf3' => 'D', '0c12f1' => 'E',
             '10000c' => 'F', '101d01' => 'G', '1a7508' => 'H', '281603' => 'I', '2b0040' => 'J',
             '31a5fc' => 'K', '415aff' => 'L', '46d300' => 'M', '46fe37' => 'N', '48ffa1' => 'O',
             '4c00a0' => 'P', '4deff7' => 'Q', '540ff5' => 'R', '581404' => 'S', '5d6900' => 'T',
             '61023c' => 'U', '98f0fd' => 'V', '9aa3fe' => 'W', '9affc4' => 'X', 'a09400' => 'Y',
             'a11b13' => 'Z', 'a8f712' => 'a', 'afff69' => 'b', 'b138fe' => 'c', 'b500d8' => 'd',
             'c9eefd' => 'e', 'cf0f75' => 'f', 'd3ffc6' => 'g', 'dba8f3' => 'h', 'e49900' => 'i',
             'e6fffb' => 'j', 'e82810' => 'k', 'ebe7fe' => 'l', 'f49486' => 'm', 'f4f413' => 'n',
             'f8f1e2' => 'o', 'f90cde' => 'p', 'f9f468' => 'q', 'fa8f37' => 'r', 'fb49df' => 's',
             'fbf3b7' => 't', 'fbfffb' => 'u', 'fd90d8' => 'v', 'fdc5d6' => 'w', 'fdf8f7' => 'x',
             'ffe3f7' => 'y', 'ffffff' => 'z'
        );
    $max_w   = $resolution;
    $max_h   = $resolution;
    //We now create an image and load the palette with the values stored as array keys
    $palette = imagecreate($max_w, $max_h);
    foreach($palette_array as $hex_color=>$val) {
        // hexdec() expects bare hex digits, so no "0x" prefix is needed
        $int_color = hexdec($hex_color);
        $color = array(
                "red"   => 0xFF & ($int_color >> 0x10),
                "green" => 0xFF & ($int_color >> 0x8),
                "blue"  => 0xFF &  $int_color
            );
        imagecolorallocate($palette, $color['red'], $color['green'], $color['blue']);
    }

    $width  = imagesx($img);
    $height = imagesy($img);
    //Now we do a proportional resize to $resolution x $resolution or less
    //GD needs whole-pixel sizes of at least 1, so truncate and clamp
    if ($height > $width)  {
        $ratio   = $max_h / $height;
        $thumb_h = $max_h;
        $thumb_w = max(1, (int) ($width * $ratio));
    } else {
        $ratio   = $max_w / $width;
        $thumb_w = $max_w;
        $thumb_h = max(1, (int) ($height * $ratio));
    }
    $thumb = imagecreate($thumb_w, $thumb_h);
    // secret tip, set the palette before and after filling the new image
    imagepalettecopy($thumb, $palette);
    imagecopyresized($thumb, $img, 0, 0, 0, 0, $thumb_w, $thumb_h, $width, $height);
    // set the new image's palette to my special 62 color palette
    imagepalettecopy($thumb, $palette);

    $fingerprint_array = array();
    $w = imagesx($thumb);
    $h = imagesy($thumb);
    // iterate through the new image, and get the array value associated with each value.
    // this is necessary because imagepalettecopy doesn't preserve the order of the colors in the palette
    for($j=0; $j<$h; $j++) {
        $string = "";
        for($i=0; $i<$w; $i++) {
            $color = imagecolorsforindex($thumb, imagecolorat($thumb, $i, $j));
            // rebuild the zero-padded, 6-digit hex key used in $palette_array
            $s = sprintf('%02x%02x%02x', $color['red'], $color['green'], $color['blue']);
            $c = $palette_array[$s];
            $string .= $c;
        }
        $fingerprint_array[] = $string;
    }
    //combine this minimal representation into a string 
    $fingerprint = implode('-', $fingerprint_array);

    return($fingerprint);
}

If you increase resolution, you decrease false positives, but you also require larger storage space, and increase false negatives.  8x8px seems to be a good sweet spot.

In the simplest use, you can store this fingerprint in your database with the image, and check for duplicate fingerprints on each upload.  Then, if you find a duplicate fingerprint, you can perform some more advanced checks to decide whether to disallow the new upload, flag it for review, link both rows to the better resolution version, or ask the user what to do.

Identical fingerprints are NOT a guarantee of duplication, so you will need additional functions to handle collisions.
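
One possible first step toward those additional checks is to score how closely two fingerprints agree, instead of requiring an exact string match. The function below is a hypothetical sketch, not the method from my Gallery 2 code or from part 2. The name fingerprint_similarity is my own for this example, and it only compares fingerprints that have the same row layout.

```php
<?php
// Hypothetical helper: fraction of palette cells on which two fingerprints
// agree, or null when their row layouts differ and cannot be compared.
function fingerprint_similarity($a, $b) {
    $rows_a = explode('-', $a);
    $rows_b = explode('-', $b);
    if (count($rows_a) !== count($rows_b)) {
        return null;
    }
    $same  = 0;
    $total = 0;
    foreach ($rows_a as $j => $row_a) {
        $row_b = $rows_b[$j];
        if (strlen($row_a) !== strlen($row_b)) {
            return null;
        }
        for ($i = 0; $i < strlen($row_a); $i++) {
            if ($row_a[$i] === $row_b[$i]) {
                $same++;
            }
            $total++;
        }
    }
    return $total > 0 ? $same / $total : null;
}

// Two made-up 2x2 fingerprints that differ in one cell.
var_dump(fingerprint_similarity('0z-zz', '0z-z0')); // float(0.75)
```

This treats any palette mismatch as a full miss, even when the two palette colors are close, so a score near 1 is a hint to look closer rather than proof of a duplicate.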

In part 2, I’ll approach the advanced uses of a fingerprint, like identifying cropped or rotated images, or simply similar, but not identical images.

Simulate an MP3 Player Display with jQuery-AnimateOverflow

I’ve been working on progressive enhancements to a web-controlled Pandora client called Pidora.  In the process of wrangling the code, I discovered that there was really no good way to display long song titles/artists/albums.  They either take up a varying amount of screen real estate, which throws off designs, or they get cut off, affecting readability.

To solve the problem, I decided to emulate the behavior of old MP3 players, and programs like WinAmp, by limiting the title to a single line, then scrolling it across the page.  I considered using marquee tags to accomplish this, but the results would have been horrifying, and would have affected all titles, whether they needed it or not.  Instead, I created a jQuery plugin that I call AnimateOverflow.

The plugin takes any group of block level elements, converts them to single line boxes, and, if the content is larger than the box, applies an animation to sweep through the content.  Currently the plugin supports the two most common animations from old MP3 Players: linear and ping pong.

Check it out:
Demo | Download

The $30 Network-Controlled Pandora Radio

Headless Pianobar Client

Introduction

I work in a small office, with 2 other people.  We all like our music, but work very different schedules.  We wanted a device that could play music, without having to leave a computer connected to it, and could be controlled by all of us, from our desks.  We needed a wide and flexible music collection, and an easy interface.  Pandora was the perfect service, but dedicated receivers were all costly and complicated.  The obvious solution?  Build my own!

jSlabify Now Supports Partial Pre-Slabbing

slabify

 

For those of you who don’t know, jSlabify is my jQuery plugin to create slabbed blocks of type, like the one seen above.  Until recently, there were two modes it could operate in, unslabbed and pre-slabbed.

In unslabbed mode, the plugin creates the rows of text, based on the size of the overall slab.  In pre-slabbed mode, the plugin looks for rows of text already defined with <span class="slabbedtext">, and then simply sizes the rows to fit.

Now there is a more flexible option, that allows the user to define sections of text to treat as a single row, and then automatically slabs the rest of the text to fit.  Simply wrap the text you want to be a single row in the tag <span class="slabbedtext">, and jSlabify will do the rest.

I made a simple demo of the new partial slabbing, here.

The main demo can be found here, as always.

When using an anonymizing VPN, Check your DNS Servers!

If you are like most home broadband users, your machine is either connected directly to a broadband modem, or connected to a router that is connected to a broadband modem.  Your machine gets all of its addressing information over DHCP from the modem or router, and all is well. If you’re using a router, chances are that your router is getting its DNS settings over DHCP, from your ISP.  This means that your computer is using DNS servers that are linked to your ISP in your area.

If you start up a VPN session, ideally you receive a new set of DNS servers from the VPN endpoint.  However, that is not always the case.  What can end up happening is that your machine sends a DNS query through the encrypted pipe, to your local ISP-controlled DNS server.  Why is this bad?  Because, when an ISP gets a DNS request from a known VPN provider, they can simply look for a user sending traffic to that VPN’s IP address, in their local area.  Once they find that, they potentially have a one-to-one mapping between user and data requested.

So, how do I fix it?  If you’re lucky, your VPN provides you with DNS servers.  Use them for ALL traffic, not just encrypted VPN use.  If you aren’t so lucky, you can mitigate the issue by using a large scale DNS server that doesn’t serve a specific area, such as Google’s DNS servers (8.8.8.8 and 8.8.4.4).  These will log data, but through the VPN, they have no reasonable way of identifying you through the data.
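
If you’re on Linux or another Unix-like system, one rough way to check is to list the resolvers your machine is configured to use while the VPN is up. This sketch assumes the resolvers are recorded in /etc/resolv.conf, which is not always the case (some systems list a local stub resolver there instead). The helper name listNameservers and the list of expected servers are hypothetical, so swap in the servers your VPN provider gives you.

```php
<?php
// Hypothetical helper: pull the "nameserver" entries out of resolv.conf-style text.
function listNameservers($resolvConf) {
    $matches = array();
    preg_match_all('/^\s*nameserver\s+(\S+)/m', $resolvConf, $matches);
    return $matches[1];
}

// Assumption: these are the servers you expect to see. Replace with your own.
$expected = array('8.8.8.8', '8.8.4.4');

// Assumption: a Unix-like system that keeps its resolver list in this file.
$path = '/etc/resolv.conf';
$conf = is_readable($path) ? file_get_contents($path) : '';

foreach (listNameservers($conf) as $server) {
    $status = in_array($server, $expected, true) ? 'expected' : 'UNEXPECTED, check this one';
    echo $server . ': ' . $status . "\n";
}
```

Run it once before connecting and once after. If the list doesn’t change and still shows your ISP’s servers, your queries are probably going where you don’t want them.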

Special Case: Proxy to VPN
If you are using my AnonyBox, or another proxy solution to connect to your VPN, your browser may be sending your DNS queries directly through your connection to your ISP, unencrypted.  This is VERY BAD.

Chrome is the only browser that handles the situation correctly by default.  As long as you are using a SOCKS v5 proxy or HTTP proxy, all DNS queries are made proxy-side.  However, if you are using a SOCKS v4 proxy, you are not safe.

In Firefox, you will need to change a setting.  Type “about:config” into the address bar, and find the line that reads “network.proxy.socks_remote_dns”.  Set it to true and restart Firefox.  For Firefox, you will want to ensure that you are using a SOCKS v5 proxy, and not an HTTP proxy.

As always, stay safe and have fun.

Fixing Unicode Support in Google Chrome

If you’re a Chrome user on Windows, you’ve likely noticed that support for international characters and Unicode glyphs is pretty bad.  If you’ve ever visited a foreign page, or one that uses the full UTF-8 character set, you’ve probably seen something like this:

unicode

These boxes are the “missing glyph” symbol, and represent a character that Chrome cannot render.  Firefox, on the other hand, displays them quite well.

As it turns out, the issue is not really Chrome’s fault.  When Chrome encounters a glyph that doesn’t exist in the font used to render a page, it attempts to find that glyph in a list of common fonts that the user might have available.  It’s only after exhausting that list that the missing glyph symbol is shown.

You might ask, why are there so many missing glyphs in Chrome on Windows, and why doesn’t Firefox have the same issue?  The answer to the first question is that Windows doesn’t, by default, include any of these common fonts that have good Unicode support.  In fact, Windows includes NO font with good Unicode support out of the box.  The answer to the second is actually a pretty smart hack from the Mozilla team.  Firefox contains an internal glyph set to fall back on, if the system cannot display the character.

So, now that we understand the issue, how do we fix it?  Easy: install one of the font sets that Chrome knows to check for Unicode glyphs, namely Code2000.

Code2000, Code2001, and Code2002 are three TrueType fonts that were designed by James Kass in 2008.  They are known as a Pan-Unicode font set, designed to contain as many glyphs as possible.  They were available for free, from Kass’ website, until it went down in 2011.

I have hosted a mirror with my fbformat project, to enable Unicode formatting in Chrome.  Simply download the ZIP, extract the files, and copy them into the Fonts folder in Control Panel.  After a quick restart, Chrome will have full Unicode support.

Code2000 Font Set

To test your new support, try out fbformat, and make some pretty formatted Facebook statuses.

Facebook Format Tool 2.0 is here

I finally completed a long-awaited overhaul of the Facebook Format tool.  The tool makes it easy to bold and italicize Facebook statuses, and simple to add special characters like hearts, stars, and crosses.

My favorite new feature is the one-click chessboard.  It allows you to play chess with your friends, via Facebook comment!

http://gschoppe.com/projects/fbformat