Within the past decade, advances in data-driven technologies have fundamentally transformed the corporate landscape. In the past, most business problems could be tackled in Microsoft Excel - today this is no longer the case. There is simply too much data that is accessible only through programming, be it querying SQL databases or key-value stores, fetching data from APIs, scraping data from websites or downloading files from web servers. Transforming the data increasingly requires programming as well, such as parsing text files or applying complex analytical functions. Sometimes the size of the data alone is the limiting factor - Excel, for instance, supports no more than 1,048,576 rows per worksheet. As time goes on, programming will be the only way to adequately manage and analyze all of this data.
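To make the size point concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. The `sales` table, its `region` and `amount` columns and the helper names are hypothetical and exist only for illustration. The demo fills an in-memory database with one more row than the worksheet limit mentioned above, then lets SQL do the aggregation.

```python
import sqlite3

EXCEL_MAX_ROWS = 1_048_576  # worksheet row limit cited in the text


def build_demo_db(n_rows):
    """Create an in-memory database with a hypothetical `sales` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    regions = ("EMEA", "APAC", "AMER")
    # Made-up data: one sale of amount 1 per row, cycling through the regions.
    rows = ((regions[i % 3], 1) for i in range(n_rows))
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def revenue_by_region(conn):
    """Return {region: total amount}; the database performs the aggregation."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query).fetchall())


if __name__ == "__main__":
    # One row more than a single worksheet can hold.
    conn = build_demo_db(EXCEL_MAX_ROWS + 1)
    print(revenue_by_region(conn))
```

A real analysis would connect to an existing database rather than generate rows, but the query step would look much the same.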
Despite this structural shift in how businesses approach problems, many firms - especially large, established ones - have been slow to extricate themselves from their dependence on Excel. Barring rare exceptions, there is little motivation to train analysts in a 21st-century skill set.
But while increasingly obsolete analytical skills carry no direct cost, they carry an enormous opportunity cost. Generalist analysts (such as investment banking analysts, management consulting analysts, business analysts, etc.) spend hundreds of hours manually traversing and copying data from flat files or PDFs, updating Excel workbooks and aggregating data found across remote corners of the Internet. In short, these are highly manual, highly automatable tasks. Business insights are overlooked when analysts allocate more time to manual work than analytical work. Even worse, analysts are prone to committing costly mistakes due to the sheer volume of manual labor.
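As a rough illustration of how automatable this kind of work is, the sketch below consolidates every CSV file in a folder and totals one column per client, using only the Python standard library. The folder layout, the `client` and `amount` column names and the function name `total_by_key` are assumptions made for the example, not a description of any particular firm's data.

```python
import csv
from collections import defaultdict
from pathlib import Path


def total_by_key(folder, key="client", value="amount"):
    """Sum the `value` column per `key` across every *.csv file in `folder`.

    The column names are hypothetical; adjust them to match the real files.
    """
    totals = defaultdict(float)
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            # DictReader uses each file's header row for the column names.
            for row in csv.DictReader(handle):
                totals[row[key]] += float(row[value])
    return dict(totals)


if __name__ == "__main__":
    # Assumes a local folder named "reports" that contains the CSV files.
    for client, amount in sorted(total_by_key("reports").items()):
        print(f"{client}: {amount:,.2f}")
```

Once written, the same few lines run unchanged whether the folder holds three files or three thousand, which is where the time savings and the reduction in copy-paste errors come from.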
The ability to program therefore offers a sizable multiplier on analyst productivity. After all, these are the same savvy analysts now technically equipped to automate otherwise tedious and redundant work. Employers need not (and should not) hire programmers to fill this role - rather, they must train their business-minded graduates to program. They must combine the business analyst skill set with the data analyst skill set. They must teach their analysts to code.
Before diving into the exact programming skills required for today’s data analyst, it’s important to understand conceptually why programming is so important. Using programming and data languages offers the following benefits:
While these benefits primarily concern data analysis, there is an added bonus: programming facilitates an understanding of non-analytical fields in data science, such as data engineering or webapp development. This “vertical integration” along the data science stack is non-trivial - it means you are able to manage your data from start to finish.
Data analysts should be able to tackle any business problem which involves, perhaps unsurprisingly, both (1) data and (2) analysis. For example, if data is stored in a database, then a programming language alone is insufficient to solve the business problem - a data language is required. Therefore, proficiency in these technologies (ordered from most to least commonly used) should cover the vast majority of business problems faced by analysts:
I hope this post clarified the necessity of programming in solving business problems. Business analysts will always be on the front lines when it comes to working with data, and as a result they should be equipped with the right tools to handle the data.
These data analysis skills need not replace their business acumen - on the contrary, learning to code will only enhance it.