Posts

Hacker News Dataset October 2016

Image
Our latest project on Sizzle  is a visualization of the Top 10k Posts of All Time  on Hacker News . To create the visualization, we first needed to collect the data. I noticed that there was an old copy of the hacker news dataset available on Big Query . But I needed an up-to-date copy, so I looked into the Hacker News Firebase API . The API allows you to get each item by Id. You can start by retrieving the current Max ID, then walking backwards from there. (Items my be stories, comments, etc., it's the same API endpoints for all types of items.) There is no rate limit, so I created the following script that will generate a text file with 10MM lines containing all of the URIs to retrieve. (we will then feed this file into wget using xargs) Note: 10MM items was ~5 years worth of data. Script to create the 10MM line file of URIs to retrieve: https://gist.github.com/aaronhoffman/1f753c660d7364bb594a36af350b227c That script takes about 10 minutes to produce a file t

Top Cities in the United States

To find a data source for visualizations , I found myself commonly searching for "Top 200 Cities in the US", "largest cities by land area", and "largest meto area by population", etc. I would then have to combine the list I found with other data sources to get all the info I needed. I thought I'd try to keep an up-to-date copy of this list, in case others also found this helpful. You can find that list here:  https://gist.github.com/aaronhoffman/e1893d32fa1254429abf57f5c0413fa3 I will try to keep this list up-to-date with additional information and API keys as I use it over time. Hope this helps, Aaron

Reddit API OAuth C# .NET Example

Image
I had a need to connect to the Reddit API via C# .NET, and thought I'd put this little tutorial together since most I found were in java or python. Note: General Reddit OAuth API info:  https://www.reddit.com/dev/api/oauth 1. Go here and make an app:  https://www.reddit.com/prefs/apps 2. Use the values from the app you created above, in this code sample:   gist:  https://gist.github.com/aaronhoffman/b59585d507601b05d8db02493eaaf73e   more info:  https://github.com/reddit/reddit/wiki/OAuth2 3. Make API requests:   gist:  https://gist.github.com/aaronhoffman/b59585d507601b05d8db02493eaaf73e Check out the result here : interactive data visualization Hope this helps, Aaron

VirtualBox Right-Click Windows Taskbar On Mac

Using VirtualBox on my Mac (in windowed mode, not full screen), I noticed that right-click on the Windows taskbar was not working. I commonly use right-click on the task bar to launch an application as administrator, pin an application to the taskbar, etc. I could two-finger click on other windows to get the right-click menu to appear, but that did not work on the taskbar. To get the right-click menu to appear on the windows taskbar, I had to hold down the "host key" which for me was the left Mac command key, and single-finger click the item in the taskbar. Hope this helps! Aaron

ASP.NET MVC 5 User Admin

AKA (for google-fu): asp.net mvc 5 web site administration tool asp.net mvc 5 web configuration tool asp.net mvc 5 identity asp.net mvc 5 membership I have missed the asp.net Web Site Administration Tool that used to be included with Visual Studio. I created an alternative for asp.net mvc 4 you can find here . For asp.net mvc 5, I created an  Azure Site Extension  to provide the same functionality. It can be run from VS on your local machine as well as installed as a site extension on Azure. Links: extension:  https://www.siteextensions.net/packages/AspNetUserMaintenanceAzureSiteExtension/ github repo:  https://github.com/Stonefinch/AspNetUserMaintenanceAzureSiteExtension Hope this helps, Aaron UPDATE These are now published to nuget.org  https://www.nuget.org/packages/AspNetUserMaintenanceAzureSiteExtension/

Gather Metadata For Each Column of SQL Table

Image
Often when working with a new data set I'll run these queries to learn more about each column of data. https://gist.github.com/aaronhoffman/eb30805ee2f5cafc64152dd1def800bd For example, you can run a single query to union together aggregate data on each numeric column. Result looks something like this: Hope this helps, Aaron

Get Line Numbers In Exception Stack Trace

Image
If you want to ensure you get line numbers in your Exception Stack Trace, you need to make sure your project is set up to provide "full" debug info on build. By default, for "Release" configuration, this is set to "pdb-only". Steps to enable: In Visual Studio... 1. Ensure your project is set to "Release" Solution Configuration (or whichever config you use for deployments) 2. Go to Project > Properties > Build 3. Click the Advanced Button in the bottom right. 4. Under Output > "Debug Info", choose "full" from the dropdown. 5. Save all settings and build your project. You will now see line numbers in Exception Stack Traces. Hope this helps, Aaron

Write Azure WebJob Logs to SQL

We've been working with Azure WebJobs recently (more info: here and here ) and even though the built in logging to Azure Storage is great, it is difficult to query at times. In more recent version of the WebJobs SDK, the team exposed a TraceWriter collection via the JobHost configuration that allows consumers to write logs to a custom repository. The snippet linked below is a naive implementation of using that collection to write the logs to SQL Server. Feel free to adapt it to fit your needs (i.e. performance concerns, async, batching, etc.) Write Azure WebJob Logs to SQL: https://gist.github.com/aaronhoffman/3e319cf519eb8bf76c8f3e4fa6f1b4ae Hope this helps, Aaron

Convert WGS84 to Web Mercator Pixel

Image
My google-fu wasn't the best when searching for a way to convert WGS84 to Web Mercator EPSG:3857, which is the common Google Maps and Bing Maps projection. If you have arrived here in search of an answer, the links to explain everything are below. Essential Background https://alastaira.wordpress.com/2011/01/23/the-google-maps-bing-maps-spherical-mercator-projection/ Take the EPSG WKT from article 1, and use it in the proj.net library http://blogs.u2u.be/diederik/post/2010/01/01/Converting-Spatial-Coordinates-with-ProjNET.aspx The definitive MSDN article https://msdn.microsoft.com/en-us/library/bb259689.aspx Code from MSDN article that doesn't use the proj.net Library https://gist.github.com/aaronhoffman/f53e1852ca289e4a806c062f97a18f05 Live Demo (in JavaScript): https://jsfiddle.net/aaronhoffman/L1kpdmgz/ And this next section is just to help others search and get here: Convert WGS84 to Web Mercator Convert WGS84 to Google Maps Convert WGS84 to Bing M

Pixel Grid Images 8x8 pixel grid on 256 pixel image

Image
Working with some images of maps, I had a need for some pixel grid overlays. I couldn't find any with a quick google search so I made these in paint.net using the Grid Maker  plugin. Images below 8x8 pixel grid on 256 pixel image 8x8 pixel grid on 1024 pixel image 16x16 pixel grid on 256 pixel image 16x16 pixel grid on 1024 pixel image Hope this helps, Aaron

UTM Zone WGS84 Latitude Longitude Bounding Box

We are using proj.net  to calculate the UTM Easting/Northing values for a given pair of WGS84 Latitude/Longitude coordinates. But I couldn't find an easy way to determine the correct UTM Zone for a WGS84 coordinate pair. The Zone had to be provided for the library to calculate the Easting/Northing values. I wanted the library to decide the best UTM zone based on the lat/lng coordinates. To assist with this, I created the following table containing the "bounding box" lat/lng values for each UTM Zone in North America. Test Code and CSV file: https://gist.github.com/aaronhoffman/699d3659e1f68991e85d05e5637e1c5a The easiest way I found to determine these values was to use the centroid point for each UTM Zone, then add +/- 4 degrees Latitude (North/South) and +/- 3 degrees Longitude (East/West). I have also provided some C# code to convert these into arrays, and then an example using a " point in polygon " function to determine the appropriate UTM Zone.

Azure HTTP Web Server Logs to SQL

Image
If you enable File System Web Server logs for your Azure App Service... Settings > Diagnostics logs > Web server logging > File System You'll start to see logs on the file system for your app service here: (you can find your FTP host and credentials in the publish profile file) /LogFiles/http/RawLogs Alternatively, you can see these logs through the Kudu UI: https://{yourappservicename}.scm.azurewebsites.net/DebugConsole/?shell=powershell Instead of downloading these one and a time and parsing through them, you can use the following library to assist in loading them into a relational database. You could even run this as a webjob within your app service. Get the code here: https://github.com/Stonefinch/AzureWebServerLogsToSql **Update: This repo now includes a web project that references the console app as a WebJob. You may also want to check out the LogParser tool by Microsoft https://blogs.msdn.microsoft.com/friis/2014/02/06/how-to-