Posts

Showing posts from 2016

Hacker News Dataset October 2016

Image
Our latest project on Sizzle is a visualization of the Top 10k Posts of All Time on Hacker News.



To create the visualization, we first needed to collect the data.

I noticed that there was an old copy of the hacker news dataset available on Big Query. But I needed an up-to-date copy, so I looked into the Hacker News Firebase API.

The API allows you to get each item by Id. You can start by retrieving the current Max ID, then walking backwards from there. (Items my be stories, comments, etc., it's the same API endpoints for all types of items.)

There is no rate limit, so I created the following script that will generate a text file with 10MM lines containing all of the URIs to retrieve. (we will then feed this file into wget using xargs)

Note: 10MM items was ~5 years worth of data.

Script to create the 10MM line file of URIs to retrieve:
https://gist.github.com/aaronhoffman/1f753c660d7364bb594a36af350b227c

That script takes about 10 minutes to produce a file that's around 560MB in…

Top Cities in the United States

To find a data source for visualizations, I found myself commonly searching for "Top 200 Cities in the US", "largest cities by land area", and "largest meto area by population", etc.

I would then have to combine the list I found with other data sources to get all the info I needed.

I thought I'd try to keep an up-to-date copy of this list, in case others also found this helpful.

You can find that list here: https://gist.github.com/aaronhoffman/e1893d32fa1254429abf57f5c0413fa3

I will try to keep this list up-to-date with additional information and API keys as I use it over time.


Hope this helps,
Aaron




Reddit API OAuth C# .NET Example

Image
I had a need to connect to the Reddit API via C# .NET, and thought I'd put this little tutorial together since most I found were in java or python.


Note: General Reddit OAuth API info: https://www.reddit.com/dev/api/oauth


1. Go here and make an app: https://www.reddit.com/prefs/apps

2. Use the values from the app you created above, in this code sample:
  gist: https://gist.github.com/aaronhoffman/b59585d507601b05d8db02493eaaf73e
  more info: https://github.com/reddit/reddit/wiki/OAuth2

3. Make API requests:
  gist: https://gist.github.com/aaronhoffman/b59585d507601b05d8db02493eaaf73e


Check out the result here:





Hope this helps,
Aaron












VirtualBox Right-Click Windows Taskbar On Mac

Using VirtualBox on my Mac (in windowed mode, not full screen), I noticed that right-click on the Windows taskbar was not working. I commonly use right-click on the task bar to launch an application as administrator, pin an application to the taskbar, etc. I could two-finger click on other windows to get the right-click menu to appear, but that did not work on the taskbar.
To get the right-click menu to appear on the windows taskbar, I had to hold down the "host key" which for me was the left Mac command key, and single-finger click the item in the taskbar.

Hope this helps! Aaron

ASP.NET MVC 5 User Admin

AKA (for google-fu):
asp.net mvc 5 web site administration tool
asp.net mvc 5 web configuration tool
asp.net mvc 5 identity
asp.net mvc 5 membership

I have missed the asp.net Web Site Administration Tool that used to be included with Visual Studio. I created an alternative for asp.net mvc 4 you can find here.

For asp.net mvc 5, I created an Azure Site Extension to provide the same functionality. It can be run from VS on your local machine as well as installed as a site extension on Azure.

Links:
extension: https://www.siteextensions.net/packages/AspNetUserMaintenanceAzureSiteExtension/
github repo: https://github.com/Stonefinch/AspNetUserMaintenanceAzureSiteExtension


Hope this helps,
Aaron

Gather Metadata For Each Column of SQL Table

Image
Often when working with a new data set I'll run these queries to learn more about each column of data.

https://gist.github.com/aaronhoffman/eb30805ee2f5cafc64152dd1def800bd

For example, you can run a single query to union together aggregate data on each numeric column. Result looks something like this:



Hope this helps,
Aaron

Get Line Numbers In Exception Stack Trace

Image
If you want to ensure you get line numbers in your Exception Stack Trace, you need to make sure your project is set up to provide "full" debug info on build. By default, for "Release" configuration, this is set to "pdb-only".

Steps to enable:

In Visual Studio...

1. Ensure your project is set to "Release" Solution Configuration (or whichever config you use for deployments)

2. Go to Project > Properties > Build

3. Click the Advanced Button in the bottom right.

4. Under Output > "Debug Info", choose "full" from the dropdown.

5. Save all settings and build your project. You will now see line numbers in Exception Stack Traces.


Hope this helps,
Aaron








Write Azure WebJob Logs to SQL

We've been working with Azure WebJobs recently (more info: here and here) and even though the built in logging to Azure Storage is great, it is difficult to query at times.

In more recent version of the WebJobs SDK, the team exposed a TraceWriter collection via the JobHost configuration that allows consumers to write logs to a custom repository.

The snippet linked below is a naive implementation of using that collection to write the logs to SQL Server. Feel free to adapt it to fit your needs (i.e. performance concerns, async, batching, etc.)

Write Azure WebJob Logs to SQL:
https://gist.github.com/aaronhoffman/3e319cf519eb8bf76c8f3e4fa6f1b4ae


Hope this helps,
Aaron


Convert WGS84 to Web Mercator Pixel

Image
My google-fu wasn't the best when searching for a way to convert WGS84 to Web Mercator EPSG:3857, which is the common Google Maps and Bing Maps projection. If you have arrived here in search of an answer, the links to explain everything are below.


Essential Background
https://alastaira.wordpress.com/2011/01/23/the-google-maps-bing-maps-spherical-mercator-projection/

Take the EPSG WKT from article 1, and use it in the proj.net library
http://blogs.u2u.be/diederik/post/2010/01/01/Converting-Spatial-Coordinates-with-ProjNET.aspx


The definitive MSDN article
https://msdn.microsoft.com/en-us/library/bb259689.aspx

Code from MSDN article that doesn't use the proj.net Library
https://gist.github.com/aaronhoffman/f53e1852ca289e4a806c062f97a18f05



Live Demo (in JavaScript):
https://jsfiddle.net/aaronhoffman/L1kpdmgz/


And this next section is just to help others search and get here:

Convert WGS84 to Web Mercator
Convert WGS84 to Google Maps
Convert WGS84 to Bing Maps
Convert WGS84 to EPSG:3857

Pixel Grid Images 8x8 pixel grid on 256 pixel image

Image
Working with some images of maps, I had a need for some pixel grid overlays. I couldn't find any with a quick google search so I made these in paint.net using the Grid Maker plugin.


Images below


8x8 pixel grid on 256 pixel image




8x8 pixel grid on 1024 pixel image




16x16 pixel grid on 256 pixel image




16x16 pixel grid on 1024 pixel image



Hope this helps,
Aaron


UTM Zone WGS84 Latitude Longitude Bounding Box

We are using proj.net to calculate the UTM Easting/Northing values for a given pair of WGS84 Latitude/Longitude coordinates. But I couldn't find an easy way to determine the correct UTM Zone for a WGS84 coordinate pair. The Zone had to be provided for the library to calculate the Easting/Northing values. I wanted the library to decide the best UTM zone based on the lat/lng coordinates.

To assist with this, I created the following table containing the "bounding box" lat/lng values for each UTM Zone in North America.

Test Code and CSV file:
https://gist.github.com/aaronhoffman/699d3659e1f68991e85d05e5637e1c5a

The easiest way I found to determine these values was to use the centroid point for each UTM Zone, then add +/- 4 degrees Latitude (North/South) and +/- 3 degrees Longitude (East/West).

I have also provided some C# code to convert these into arrays, and then an example using a "point in polygon" function to determine the appropriate UTM Zone.



Hope this helps,

Azure HTTP Web Server Logs to SQL

Image
If you enable File System Web Server logs for your Azure App Service...

Settings > Diagnostics logs > Web server logging > File System





You'll start to see logs on the file system for your app service here:

(you can find your FTP host and credentials in the publish profile file)

/LogFiles/http/RawLogs




Alternatively, you can see these logs through the Kudu UI:

https://{yourappservicename}.scm.azurewebsites.net/DebugConsole/?shell=powershell



Instead of downloading these one and a time and parsing through them, you can use the following library to assist in loading them into a relational database.

You could even run this as a webjob within your app service.


Get the code here:
https://github.com/Stonefinch/AzureWebServerLogsToSql


**Update: This repo now includes a web project that references the console app as a WebJob.

You may also want to check out the LogParser tool by Microsoft https://blogs.msdn.microsoft.com/friis/2014/02/06/how-to-analyse-iis-logs-using-logparser-logparser…

Store Amazon SES Delivery Notifications in DynamoDB using SNS and Lambdas

Image
I spent way too much time configuring my Amazon AWS account to store Simple Email Service SES Delivery Notifications to a DynamoDB Table. I thought I'd put this guide together to hopefully help future developers.


This guide does not currently cover these topics, but I may include them in the future:
1. Setting up your Amazon AWS Account.
2. Setting up Amazon SES. (dkim, domain/email identities, etc.)
3. Integration with Amazon SES from your app.
4. SNS Topic creation/configuration.



Prerequisites:
1. Amazon AWS account has been created/configured.
2. Amazon SES has been created/configured.
3. Amazon SNS Topic has been created/configured.
4. Delivery Notifications for the Amazon SES Domain Identity have been configured to be sent to the SNS Topic.
5. Your app is integrated with Amazon SES and successfully sending email.



Step by Step Guide to Storing SES Delivery Notification messages into a DynamoDB Table:

So at this point, you're successfully integrated with Amazon Simple Email S…

Update Azure SQL Firewall Rule using Powershell

Image
Microsoft Azure SQL only allows connections from whitelisted IP addresses. My ISP seemed to change my external/public IP address daily. The combination of these two things was very annoying for me.

I assembled the powershell script below to make my life a little easier. It updates the firewall rule for each Azure SQL instance, and ensures they are all set to my current IP address.


1. Install and Configure Azure Powershell
https://azure.microsoft.com/en-us/documentation/articles/powershell-install-configure/

2. Execute Get-AzurePublishSettingsFile to get the publishsettings file for the subscription you're going to be working with (a browser window will open, select the profile there. Note: you can add multiple publish settings.).




3. Execute Import-AzurePublishSettingsFile and provide the path to the file downloaded from the Get-AzurePublishSettingsFile command.
4. Get your Public IP address Powershell script: https://gist.github.com/aaronhoffman/d5efc68d89995017caa4
5. Execute the…

Convert SQL Server Varbinary to Byte Array

I had a need to convert the HEX output from SQL Server Management Studio's Results to Grid output of a varbinary column back to the byte array of the file it represented. I couldn't find an all-in-one script online, so I thought I'd create this C# gist. (original parse script here)

https://gist.github.com/aaronhoffman/ad16b27d14e2f5d7e16c


Hope this helps,
Aaron

PowerShell - List File Line Count For Files in a Directory that Match a Given Pattern

I had a need to get the total count of all lines, for a given set of .csv files. For example, there are 5 files each with 1000 lines, I wanted a script that would output: 1000, 1000, 1000, 1000, 1000.

I could not find a script or command to do this, so I wrote one in PowerShell. Enjoy


List File Line Count For Files in a Directory that Match a Given Pattern

https://gist.github.com/aaronhoffman/416b88e64b970c4c9cb4


hope this helps,
Aaron