What I do

  • Web scraping: Data extraction from HTML or JavaScript-driven websites, including those requiring authentication or form submission and those using pagination.
  • File scraping: Data extraction from plain-text or Excel files. Datasets stored across multiple files (e.g. one Excel workbook or text file per year of data) often must be combined.
  • Exploratory data analysis & data validation: What is in the data? Is there missing data? Are there missing files? Are there outliers? Null values? Typos? Do values or dates need to be extracted from text?
  • Data cleaning: Remediating data integrity issues identified during exploratory data analysis, such as correcting typos, grouping similar text entities, using regular expressions to extract dates or numbers from a column, and filling or dropping null data.
  • Analytics: Evaluating trends over time, comparing characteristics by segment, identifying correlations between variables, or developing regression models.
  • Visualization: Charting analyses or variables, originally in tabular form, to quickly identify and emphasize patterns within the data.

* I use mostly Python and pandas.
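
As an illustration of the file scraping and data cleaning steps above, here is a minimal sketch. It uses only the Python standard library so it runs anywhere; real projects would typically do the same with pandas (for example `pandas.concat`). The file contents, column names, and helper functions are hypothetical and exist only for this example.

```python
import csv
import io
import re

# Hypothetical per-year files, held in memory so the sketch is self-contained.
RAW_FILES = {
    2013: 'client,amount\nAcme Corp ,"1,200"\nacme corp,950\n',
    2014: "client,amount\nBeta LLC,\nACME  CORP,$1400\n",
}

def clean_client(name):
    """Group similar text entities: collapse whitespace and lowercase."""
    return " ".join(name.split()).lower()

def parse_amount(text):
    """Extract the first number from messy text; None when nothing is there."""
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None

def load_all(files):
    """Combine every yearly file into one list of cleaned rows."""
    rows = []
    for year, content in sorted(files.items()):
        for record in csv.DictReader(io.StringIO(content)):
            rows.append({
                "year": year,
                "client": clean_client(record["client"]),
                "amount": parse_amount(record["amount"]),
            })
    return rows

rows = load_all(RAW_FILES)

# Rows with no usable amount are flagged rather than silently dropped.
missing = [row for row in rows if row["amount"] is None]
```

Whether a missing value is filled or dropped depends on the analysis, so the sketch only collects those rows for review.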

What I don't do

  • Machine learning
  • Database management
  • Server administration
  • Website development
  • Interactive data visualization (e.g. d3.js)
  • Work that can't be automated (e.g. inconsistent data)

Examples of work

  • You have dozens of Excel files with important data spread across multiple tabs. You need to extract this information into a single file in order to do a multi-year analysis.
  • You have dozens of fixed-width or delimited text files. You need to extract certain information, isolate dates within a text field (e.g. "The client delivered this on 04/12/2014"), and output certain KPIs over time.
  • You need to scrape data from an HTML website after authenticating.
  • You want to know which groups are more vulnerable to certain risks than others.
  • You need to break income out into decile brackets, then correlate the brackets with certain factors.
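
Two of the examples above, isolating a date inside a text field and breaking values into deciles, can be sketched as follows. This is only an illustration under stated assumptions: the date is read as MM/DD/YYYY (the example sentence does not say which order is meant), the income figures are made up, and the helper names are hypothetical. The sketch uses the standard library; pandas offers comparable tools such as `Series.str.extract` and `pandas.qcut`.

```python
import re
import statistics
from datetime import datetime

# Assumption: dates appear as MM/DD/YYYY inside free text.
DATE_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")

def extract_date(text):
    """Return the first valid MM/DD/YYYY date found in text, else None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(0), "%m/%d/%Y").date()
    except ValueError:
        # Looks like a date but is not a real one (e.g. month 13).
        return None

def decile_of(value, cut_points):
    """Return the decile (1 to 10) that value falls into."""
    return 1 + sum(value > cut for cut in cut_points)

delivered = extract_date("The client delivered this on 04/12/2014")

# Hypothetical incomes: 10,000 through 200,000 in steps of 10,000.
incomes = [10_000 * k for k in range(1, 21)]

# Nine cut points split the data into ten brackets.
cut_points = statistics.quantiles(incomes, n=10)
deciles = [decile_of(income, cut_points) for income in incomes]
```

Once each record carries a decile label, the brackets can be compared against whatever factors the analysis calls for.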

Rate, budgeting and scheduling

  • My hourly rate is $75/hr.
  • I don't accept projects that I estimate will take over 20 hours.
  • Together we'll determine a detailed schedule and budget for completed work.
  • Any project-associated fees (e.g. travel, data purchase) are paid by the client.

Deliverables

  • Project overview and scope
  • Budget estimate, schedule & time sheets
  • Work product (and preliminary output)
  • Associated documentation

Testimonials & ratings

  • "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."

Budget score

  • For each project, I score how far under budget we finished.
  • To date, my overall score is 80% of budget.