The final (until I add a sponsors table) db structure for the legislation. Works with the existing import script, with added indexes for easier querying.
Author: George Stephanis
-
PHP Legislation Parser + Importer
What I’ve got so far. More to come, but wanted to get this out there.
-
Legislation DB Dump
Still not quite the final DB structure I’d like, but this is available for data mining and trying to build something awesome out of.
-
OpenDataDay Hackathon DC!
So I went down to DC this weekend to participate in the Open Data Day Hackathon! There were some tremendous projects proposed, but the one that caught my eye from the start of the day was one proposed by Jim Harper of the Cato Institute to track down the genealogy of legislation put forth in congress.
Basically, the goal is to programmatically find similar passages in multiple bills. This can be used for many purposes, including looking at sections in large omnibus bills and getting an idea of whether the things that get shoehorned into them have been proposed previously, and what happened then.
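To make "find similar passages" a bit more concrete, here is a minimal sketch of one simple way to compare passages: split each one into overlapping word triples ("shingles") and score pairs by Jaccard similarity. This is not the approach the team took at the hackathon; the function names, the threshold, and the sample passages are all made up for illustration.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_passages(passages, threshold=0.5, size=3):
    """Return (id_a, id_b, score) for every pair scoring at or above threshold."""
    sets = {pid: shingles(text, size) for pid, text in passages.items()}
    ids = sorted(sets)
    matches = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            score = jaccard(sets[first], sets[second])
            if score >= threshold:
                matches.append((first, second, score))
    return matches


if __name__ == "__main__":
    # Made-up passages for illustration only; not taken from any real bill.
    passages = {
        "bill-a-sec-2": "The Secretary shall submit a report to Congress "
                        "not later than 90 days after enactment.",
        "bill-b-sec-7": "The Secretary shall submit a report to Congress "
                        "not later than 180 days after enactment.",
        "bill-c-sec-1": "This Act may be cited as the Example Act.",
    }
    for first, second, score in similar_passages(passages):
        print(first, second, round(score, 2))
```

Comparing every pair like this gets slow as the number of passages grows, so a real dataset of millions of rows would need something smarter to narrow down candidate pairs first.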
So, our team largely consisted of myself, Alexander Furnas of the Sunlight Foundation, and John Bloch of 10up, with guidance from Jim Harper (previously mentioned, of the Cato Institute), Molly Bohmer (also of the Cato Institute), and Kirsten Gullickson providing some clarification on the way the XML data we were working with was structured.
I spent my time building a MySQL database and a PHP import script that could map all the relevant data from the XML files into it.
Alexander worked primarily in Python, fleshing out a way of doing Latent Semantic Analysis on the data we’ve extracted to sort out what is similar to what, and where we can find meaning in it.
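The general shape of that import step can be sketched in a few lines. This is not the PHP importer itself: it is a Python stand-in that uses SQLite in place of MySQL so it runs on its own. The sample XML is a simplified, made-up document, and apart from the `resolution_text` table name and its `text` column, every column and function name here is an assumption for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Simplified, made-up bill document; real files are larger and more deeply nested.
SAMPLE_XML = """<bill>
  <legis-num>H. R. 0000</legis-num>
  <legis-body>
    <section id="S1"><header>Short title</header><text>This Act may be cited as the Example Act.</text></section>
    <section id="S2"><header>Funding</header><text>$1,000,000 is authorized for fiscal year 2014.</text></section>
  </legis-body>
</bill>"""


def create_schema(conn):
    """Create a hypothetical one-table schema with an index for lookups by bill."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resolution_text ("
        "id INTEGER PRIMARY KEY, legis_num TEXT, section_id TEXT, "
        "header TEXT, text TEXT)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_legis_num ON resolution_text (legis_num)"
    )


def import_bill(conn, xml_string):
    """Insert one row per <section> element and return the number of rows added."""
    root = ET.fromstring(xml_string)
    legis_num = (root.findtext("legis-num") or "").strip()
    rows = []
    for section in root.iter("section"):
        header = (section.findtext("header") or "").strip()
        text_el = section.find("text")
        # itertext() flattens any inline child markup inside <text>.
        text = "".join(text_el.itertext()).strip() if text_el is not None else ""
        rows.append((legis_num, section.get("id"), header, text))
    conn.executemany(
        "INSERT INTO resolution_text (legis_num, section_id, header, text) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    print(import_bill(conn, SAMPLE_XML), "sections imported")
```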
John spent his time working on a front-end for the final dataset, to help end-users get something useful out of the data we’re building.
The data that we were pulling from can be readily accessed by anyone through the Library of Congress at the following URLs:
- http://thomas.loc.gov/home/gpoxmlc108/ [download all xmls] (6.2 mb tar & gzip)
- http://thomas.loc.gov/home/gpoxmlc109/ [download all xmls] (32 mb tar & gzip)
- http://thomas.loc.gov/home/gpoxmlc110/ [download all xmls] (69 mb tar & gzip)
- http://thomas.loc.gov/home/gpoxmlc111/ [download all xmls] (110 mb tar & gzip)
- http://thomas.loc.gov/home/gpoxmlc112/ [download all xmls] (91 mb tar & gzip)
- http://thomas.loc.gov/home/gpoxmlc113/
I’m currently putting some finishing touches on the DB structure, but when that’s done, I’ll be releasing that and the import script in a subsequent post, as well as a SQL dump for the final accumulated and sorted data — ripe for data mining. As the day was wrapping up, I had someone come to me inquiring about data mining for references to money allocated in appropriations bills and the like, and I was able to very quickly do a MySQL query along the lines of
SELECT * FROM `resolution_text` WHERE `text` LIKE '$%'
to find anything that started with a dollar sign and then listed an amount over a very limited data set of three million rows or such. The final data set will be much larger.
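That `LIKE '$%'` query only catches rows whose text begins with a dollar sign. A natural follow-up, once the rows are in hand, is to pull the amounts themselves out of the text. Here is a rough sketch of that step in Python; the regular expression and function name are assumptions of mine, and it deliberately ignores spelled-out amounts such as "$3 million".

```python
import re

# A dollar sign, then digits with optional thousands separators and optional cents.
AMOUNT_RE = re.compile(r"\$(\d{1,3}(?:,\d{3})+|\d+)(\.\d{2})?")


def dollar_amounts(text):
    """Return every dollar amount found in text as an integer number of cents."""
    amounts = []
    for match in AMOUNT_RE.finditer(text):
        whole = int(match.group(1).replace(",", ""))
        cents = int(match.group(2)[1:]) if match.group(2) else 0
        amounts.append(whole * 100 + cents)
    return amounts


if __name__ == "__main__":
    # Made-up sentence for illustration only.
    print(dollar_amounts("$1,000,000 is authorized, of which $12.50 is for postage."))
```

Working in whole cents avoids floating-point rounding when the amounts are later summed.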
-
Interview featured on the Optimizely Blog
I was featured in an interview with Cara Harshman for the Optimizely Blog in 2013.
While the original post isn’t live any longer, Archive.org comes to the rescue. It can be seen here:
Full article
-
Jonathan Coulton, Baby Got Back, Glee, and Copyright
Disclaimer! I am not a lawyer! These are just my musings, if you ARE a lawyer, I’d love to hear back from you as to whether I’m on track. Also, I call myself a Code Monkey. That’s also a song by JoCo. It’s awesome, and you should listen to it.
If you’re here, I’m going to assume you’ve heard some details on the current situation of Glee ripping off Jonathan Coulton’s cover of Baby Got Back. If not, read JoCo’s summary first.
My understanding of the general consensus is that as the “cover” is a licensed cover, he doesn’t have any specific rights to protect it from Glee using it.
The musical arrangement that the covered lyrics were set to was 100% original, and JoCo released a Karaoke track that omits all of the covered lyrics.
It is my contention that the Karaoke track is not a cover, and is instead a wholly original work, and as such, JoCo owns rights to the melody to which his cover was set.
Let me rephrase it another way:
If I write a little tune that I find to be catchy, and release it, I would own the rights to it. If, later, I purchased the rights to cover a song, and put the lyrics of the song to my completely unrelated tune, would I still have rights to my original tune? Or would the fact that I happened to combine the two rob me of the rights to my original tonal creation?
If you believe I would lose my rights, then suppose I licensed my tune under a non-commercial Creative Commons license (as JoCo did), and a third party took it and did a non-commercial cover version of a different song to said tune. Would that then rob me of my rights to the tune? Can the actions of an unrelated third party licensing it rob the original rights-holder of his rights to the licensed tune?
If you have a different answer to each of the last two questions, I’ve gotta ask why. Because, for me, both of them seem to be a firm “Yes, I should keep the rights to the tune.”
In fact, that is why the law reads:
A compulsory license includes the privilege of making a musical arrangement of the work to the extent necessary to conform it to the style or manner of interpretation of the performance involved, but the arrangement shall not change the basic melody or fundamental character of the work, and shall not be subject to protection as a derivative work under this title, except with the express consent of the copyright owner.
As such, I question whether the portion of JoCo’s Baby Got Back that was a wholly new melody (that was ripped off by Glee) would suffer the same shackling to the original rights holder, when I would consider that melody to not be a derivative work, and the ‘cover’ to in fact be a derivative work (as it has a wholly new melody).
The law says that it can’t be a derivative work if it keeps the original basic melody. JoCo didn’t. So — derivative work?
-
The best phrased response to the current GPL spat between WordCamps and Envato
I will preface my comments by saying that I disagree completely with the approach the WordPress Foundation is taking here. The problem is a disagreement between the WPF and Envato, and developers are merely caught in the crossfire.
This approach makes developers choose between putting food on the table and being a persona non grata to the WPF, or else risking their legitimate revenue stream, and be in the WPF’s good graces. Unfortunately, for Jake and thousands of developers like him, the WPF’s good graces don’t put food on the table.
And while the tactic may ultimately work, there are only so many times you can turn the 50-mm barrels on the rank-and-file in the community itself, and not have adverse effects.
That said, I take issue with Envato’s stance, as well:
To my mind, it doesn’t make sense that a regular license sold on ThemeForest should give such a buyer the right to on-sell a creator’s work at that volume – if only for the simple reason that volume reselling can significantly reduce demand for the original work.
You are arbitrarily restricting the ability of your marketplace suppliers to offer their work under the license of their choice. The way I read this, your real concern is that Envato would lose commissions if Themes in their marketplace were offered as 100% GPL, and led to downstream distribution. If that is the real concern, it may or may not be valid, but it is disingenuous to couch such concern as concern for your marketplace sellers.
If that is *not* the real concern, then I don’t see how any real concern exists. Just let your marketplace sellers *choose* to offer their works under 100% GPL. Put up huge banners decrying the risks of doing so. Strongly suggest that they don’t do so. Rail against the GPL all you want. Make them click through 3 “are you sure?” dialogue boxes.
But offer the choice.
I guarantee you that the WordPress Theme developers who opt-in to offering their works under a 100% GPL license do so under full understanding of the license terms, and either disagree with your risk assessment, or have evaluated the risk-reward differently. You don’t need to “protect” them from the license.
Just offer them the choice.
This. A thousand times this.
-
Taxonomy List Page
Have you ever wanted to have the archive page for a custom taxonomy be a listing of the taxonomies? Easy! Just create a Page and have the slug match the base slug for the custom taxonomy! Then on that page, you could just create the index by hand, or use THIS shortcode!
In short, if you just call it as
[taxonomy-list taxonomy="my-taxonomy"]
it will pull in a list of all of the taxonomies available! You can tweak some other options, but that’s the gist.
The bulk of the code, however, is to let you pull in an image for each taxonomy item!
It does this by grabbing the first post (or other post type — you can specify this via the post_type argument) in that taxonomy that has a featured image, and using that as the featured image for the category! Easy peasy!
Just use that method via
[taxonomy-list taxonomy="my-taxonomy" images="yup"]
or anything else that doesn’t evaluate to empty.
Also threw in some caching for good measure. It shoves the result in a transient for an hour.
There’s probably some other bits in here worth fleshing out — proper usage for title_li and depth, when images is turned on, etc etc, but for now, it serves my needs, and hopefully some of yours as well!
add_shortcode( 'taxonomy-list', 'my_taxonomy_list' );

function my_taxonomy_list( $atts ) {
    $args = shortcode_atts( array(
        'taxonomy'   => 'category',
        'post_type'  => 'post',
        'title_li'   => '',
        'depth'      => 1,
        'hide_empty' => 1,
        'images'     => 0,
    ), $atts );

    // Base query for finding one post with a featured image in each term.
    $get_posts_args = array(
        'post_type'   => $args['post_type'],
        'numberposts' => 1,
        'meta_query'  => array(
            array(
                'key'     => '_thumbnail_id',
                'compare' => 'EXISTS',
            ),
        ),
    );

    // Cache the rendered markup per unique set of arguments.
    $key = 'my_taxonomy_list-' . md5( serialize( $args ) );
    if ( false === ( $result = get_transient( $key ) ) ) {
        ob_start();
        ?>
        <ul class="taxonomy-list taxonomy-<?php echo esc_attr( $args['taxonomy'] ); ?>-list" data-taxonomy="<?php echo esc_attr( $args['taxonomy'] ); ?>">
        <?php
        if ( empty( $args['images'] ) ) {
            wp_list_categories( $args );
        } else {
            // An empty result just skips the loop; `break` is not valid outside a loop.
            $cats = get_categories( $args );
            foreach ( $cats as $cat ) {
                $img = '';
                // tax_query works for any taxonomy, including the built-in category.
                $get_posts_args['tax_query'] = array(
                    array(
                        'taxonomy' => $args['taxonomy'],
                        'field'    => 'slug',
                        'terms'    => $cat->slug,
                    ),
                );
                if ( $posts = get_posts( $get_posts_args ) ) {
                    $img = get_the_post_thumbnail( $posts[0]->ID );
                }
                ?>
                <li><a href="<?php echo esc_url( get_term_link( $cat ) ); ?>">
                    <?php echo $img; ?>
                    <?php echo esc_html( $cat->name ); ?>
                </a></li>
                <?php
            }
        }
        ?>
        </ul>
        <?php
        $result = ob_get_clean();
        set_transient( $key, $result, HOUR_IN_SECONDS );
    }
    return $result;
}
-
Draw Something Cool
So I’ve had a lot of awesome feedback for the “Draw Something Cool” bit that I’ve added to my Contact Form. It’s in actuality just the Signature add-on for GravityForms!
That being said, here are some of the best images that I’ve had people submit through it thus far:











-
Magento Duplicate Orders MySQL Search
A handy MySQL query to check a Magento DB for potential duplicate orders!
SELECT
    `quote_id`,
    COUNT(`quote_id`) AS `qty_duplicates`,
    `increment_id` AS `first_increment_id`,
    GROUP_CONCAT( `increment_id` SEPARATOR ' | ' ) AS `increment_ids`,
    `created_at`,
    `state`,
    `status`,
    `customer_firstname`,
    `customer_lastname`,
    `customer_email`,
    `grand_total`
FROM `sales_flat_order`
GROUP BY `quote_id`
HAVING COUNT(`quote_id`) > 1
ORDER BY `created_at` ASC
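To see what the GROUP BY / HAVING part of that query does without a Magento install, here is a small self-contained sketch using Python's built-in sqlite3 module. The table is cut down to three columns and the rows are made up. SQLite's `GROUP_CONCAT(column, separator)` stands in for MySQL's `GROUP_CONCAT( ... SEPARATOR ... )`, so treat this as an illustration of the idea rather than a drop-in replacement for the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_flat_order (quote_id INTEGER, increment_id TEXT, created_at TEXT)"
)
# Made-up orders: quote 11 was converted into an order twice.
conn.executemany(
    "INSERT INTO sales_flat_order VALUES (?, ?, ?)",
    [
        (10, "100000001", "2013-02-01 10:00:00"),
        (11, "100000002", "2013-02-01 10:05:00"),
        (11, "100000003", "2013-02-01 10:05:02"),
        (12, "100000004", "2013-02-01 10:09:00"),
    ],
)


def duplicate_orders(conn):
    """Return (quote_id, qty_duplicates, increment_ids) for quotes with more than one order."""
    return conn.execute(
        "SELECT quote_id, COUNT(quote_id) AS qty_duplicates, "
        "GROUP_CONCAT(increment_id, ' | ') AS increment_ids "
        "FROM sales_flat_order GROUP BY quote_id "
        "HAVING COUNT(quote_id) > 1 ORDER BY MIN(created_at) ASC"
    ).fetchall()


if __name__ == "__main__":
    for quote_id, qty, increment_ids in duplicate_orders(conn):
        print(quote_id, qty, increment_ids)
```

Only quotes that produced more than one order survive the HAVING clause, which is what flags them as potential duplicates.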
