
Hi! I'm Craig Dickson.

Thanks for coming to craigdoesdata.de

I'm continually improving my Data Analysis skills using Python, SQL, Tableau and R (plus whatever else comes to hand!), and this website is where I share my completed projects, works in progress and blog posts or articles that I write. Take a look at my full Learning Timeline below to see more detail about what exactly I've been learning.

Born and raised in Edinburgh, Scotland, I have been living in Berlin, Germany since 2014, running my own English teaching company Dickson English and generally having a lovely time.

I graduated from the University of Edinburgh with a First Class MA(Hons) degree in Sociology in 2006. It was during this course that I first developed an interest in quantitative analysis and data manipulation.



Following my graduation I worked for almost 10 years in Financial Services in Edinburgh (you can take a look at my CV or LinkedIn profile if you'd like to learn more about that), where I continued to develop my quantitative and analytical skills, before deciding to change things up and move to Germany.

I've had a wonderful time teaching, but now I'm looking for a new challenge. After years of curiosity I've decided to dive right into the worlds of Data Analytics and Programming and see what I can put together. So far it's been fascinating and rewarding, and I'm excited to keep learning new things.

When I'm not spending time with my family, teaching or playing with code, I also love making music with friends.

Get in touch if you'd like to discuss possibilities for working together.




Learning Timeline

The dates below indicate when I seriously focused on a particular topic or technology, which is not necessarily the first time I encountered or used it. If you'd like to take a closer look at my collection of certificates, click through to my certificates page.

October 2020
Streamlit for Teams.

After posting about the Berlin Covid-19 Dashboard, which I built with Streamlit so I could easily keep an eye on developments in my home district as we enter the autumn and winter, I was contacted via Twitter and asked whether I would like to participate in the beta of their new "For Teams" product.

Obviously this made me feel very cool and I said yes. It seems a social media addiction can pay off. Who knew? The product is super impressive: you simply build an app with Streamlit (normally I wouldn't say simply, but I'm genuinely so impressed with this package and how easy it makes things), add the .py file and the requirements.txt to a public GitHub repo, and point Streamlit at it. Then boom: a working web app.

Even better, to make updates you just push them to the repo and it happens automatically. So nifty! I personally am quite enjoying my journey into cloud infrastructure and getting into the details, but I know that's not the case for everyone. For data people who just want to quickly share their results, this is really worth checking out.

At present I have my Berlin Covid-19 Dashboard and my Scotland Covid-19 Dashboard running on it. It's also free for public repos, and will remain so when it moves out of beta. I promise I'm not being paid to shill for them; I'm just so impressed. Great work Streamlit! Send me a t-shirt please!

October 2020
Serverless / Cloud hosting of Dockerised Web Apps with Amazon Web Services and Google Cloud Platform.

Continuing my survey of Cloud computing / serverless web application hosting, I deployed my Berlin Covid-19 Dashboard on GCP (which has some problems with Streamlit apps, so sadly it is not the most stable of the instances I have running) and on AWS (using EC2).

AWS was significantly trickier to set up, involving SSH-ing into a virtual Amazon Linux server, installing Docker on it and then leaving it to run, but it was extremely satisfying when I got it to work. I am using the free tier and I'm mildly paranoid I'm somehow going to run up a massive bill. So far so good however!

This is a really useful skill to have: it is extremely clear that the future (the present, really) of apps is in cloud microservices, so it is exciting to learn about. Also, the learning experience of going from "I'm scared - what the heck is any of this? Don't hurt me" to "Ah yeah, just let me load up an Elastic Compute Cloud server instance, quickly install some dependencies and then deploy my containerised web application" has been really enjoyable. And still so much to learn! Exciting times.

September 2020
Serverless / Cloud hosting with Heroku.

Heroku is an excellent Platform-as-a-Service company which makes it (relatively) easy to deploy web applications to the cloud and get them up and running without the hassle and expense of operating your own server.

It's a really useful service. There is something of a learning curve, but their guides are well written and there is a lot of support available. My experience with it has been overwhelmingly positive; I suggest you take a look if you are in the market for somewhere to host your web applications.

September 2020
Web Apps with Streamlit.

Streamlit is a relatively new package aimed squarely at the Python data community, with the intention of making it easy to build and deploy applications to share results, models and analyses with others.

Honestly I think Streamlit could be a game-changer in this space. It allows easy creation of good-looking and interactive web applications using pure Python code, so it's very powerful. No more hours spent tinkering with HTML and CSS if all we want to do is share some results (although some of us quite enjoy spending hours tinkering!).

My first Streamlit application deployed a Machine Learning model which predicts the value of a house in California (in 1990) based on parameters the user can set. This was very easy to set up (relatively speaking), and I will definitely be using Streamlit more in future for quick prototyping and deployment of data applications.

September 2020
Web Apps with Flask.

I wanted to create a web application where I could use my Python skills to power the back end. I chose to use the Flask framework to do this.

I created a working CRUD app called Thing Finder first. You can read more about that here. That was really enjoyable and I learned a good amount doing that.

Next I created a fun text generator app called Holla if ya hear me. More info about that is here. That was also really fun. I'm now quite comfortable using Flask to build web applications in Python, and I look forward to building some more complex apps in future.
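The core CRUD pattern behind an app like Thing Finder can be sketched in a few Flask routes. This is a minimal illustration with an in-memory dictionary standing in for a database, and hypothetical route and field names — not the actual code from either app:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory store standing in for a real database
things = {}
next_id = 1

@app.route("/things", methods=["GET"])
def list_things():
    # Read: return every stored thing as JSON
    return jsonify(things)

@app.route("/things", methods=["POST"])
def create_thing():
    # Create: store the posted JSON under a fresh id
    global next_id
    things[next_id] = request.get_json()
    next_id += 1
    return jsonify({"id": next_id - 1}), 201

@app.route("/things/<int:thing_id>", methods=["DELETE"])
def delete_thing(thing_id):
    # Delete: drop the thing if it exists, ignore it otherwise
    things.pop(thing_id, None)
    return "", 204
```

Update would follow the same shape with a PUT route; swapping the dictionary for a database is then a change inside the route functions, not to the structure of the app.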

August 2020
Python and SQL Tutorial at freeCodeCamp.

I wanted to share some knowledge I had acquired recently about working with Python and MySQL together, so I wrote this piece to guide people through the process and hopefully show some of the power of combining these two languages for Data Analysis.

I was very excited to publish this one in freeCodeCamp's publication. If you're not familiar with freeCodeCamp, it's a non-profit organisation dedicated to providing coding education in the form of tutorials and courses, for free, to anyone who wants it. They have a huge range of courses available on many languages and aspects of technology. I have used and appreciated their resources, so it was great to be able to give a little bit back by authoring this article.
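The connect-execute-fetch pattern at the heart of that tutorial looks much the same whatever the backend. Here is a sketch using the standard library's sqlite3 module as a stand-in for MySQL (with MySQL you would connect through a driver such as MySQL Connector and use %s placeholders instead of ?); the table and names are invented for illustration:

```python
import sqlite3

# sqlite3 stands in for a MySQL connection here; the workflow
# (connect, get a cursor, execute, fetch) carries over directly
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT)")

# Always pass values as parameters rather than formatting them into
# the SQL string - this guards against SQL injection
teachers = [(1, "Craig"), (2, "Anna")]
cur.executemany("INSERT INTO teacher (id, name) VALUES (?, ?)", teachers)
conn.commit()

cur.execute("SELECT name FROM teacher ORDER BY id")
names = [row[0] for row in cur.fetchall()]
print(names)  # ['Craig', 'Anna']
```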

August 2020
Front-End Web Development

Following on from learning HTML & CSS to build this website you are reading right now earlier this year I wanted to put my new abilities to further use.

For the last five years I have been using (and paying for!) Squarespace to build the website for my English teaching company. As a challenge (and to save a not-insignificant amount of money) I decided to re-design and build the website again myself. I'm happy with how it turned out.

August 2020
Machine Learning Crash Course - Google

I have had this course from Google bookmarked for some time, as I have wanted to get into ML for a while, and this is a great (and free!) option to cover the basics. Google obviously know a thing or two about ML, and TensorFlow is one of the top options out there for building models, so it was a really useful introduction.

Having now built and experimented with Deep Neural Networks I am keen to dive further in and start building models to solve real problems.

July 2020
CS50x - Harvard

I have heard great things about Harvard's intro to Computer Science course, and I decided to give it a look beginning in July. I am happy to report that the hype is justified!

I don't come from a CS background, so it's good to go through an intro course. A lot of the time it's like revision of things I'm familiar with, but there are frequent "Oooooh!" moments as something slots into place or a connection is made that I hadn't thought of before.

The instructor, David Malan, is really excellent, and I have found myself being really excited for each week's material. Plus the whole course can be audited for free, with access to the cloud-based IDE and automated checking of the weekly problem sets. Good stuff!

July 2020
First series (third, fourth and fifth articles) published in Towards Data Science - MySQL Tutorial Series

As a way of wrapping up my recent dive back into SQL, I wrote a series taking the reader from conception to implementation and then to analysis of a relational database in MySQL. I again chose to publish on Towards Data Science.

This was a lot of work to put together, but I really enjoyed the process. As always, attempting to teach a topic helped me to really solidify my command of the material myself - I strongly recommend writing tutorials when you feel like you've understood something, it forces you to confirm that you really do!

This was meant to be just one post, but I had so much I wanted to write about the topic that it grew to three articles.

July 2020
Google Analytics

I have used Google Analytics to monitor traffic and activity on my teaching business website, but I wanted to get to know more about it, as it is an essential tool for all larger businesses with an online footprint.

I took several courses at the Google Analytics Academy, and capped it off by passing the Google Analytics Individual Qualification. Now I feel even more confident in my understanding of the platform.

June / July 2020
SQL

I took the time to refamiliarise myself with and dive much more deeply into creating and updating databases using SQL. I used SQL fairly extensively for query-writing back in my banking days, and I wanted to work on my database design and building skills. I really enjoyed the freeCodeCamp course (focused on MySQL), and I also took several DataCamp courses which helped deepen my understanding and expose me to PostgreSQL as well.

I had used SQL queries in Python code previously, but I wanted to get into designing a schema, building and populating a database from scratch and so on, as that’s a very important part of any real-world data application, and I just found it really interesting.
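Schema design is mostly about encoding relationships between tables. As a tiny illustration (invented table names, sqlite3 in place of MySQL or PostgreSQL), here is a two-table schema where a foreign key links each course to its teacher, and a JOIN recovers that relationship:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the FK constraint

# A minimal schema sketch: every course belongs to exactly one teacher
conn.executescript("""
CREATE TABLE teacher (
    teacher_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE course (
    course_id  INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    teacher_id INTEGER NOT NULL REFERENCES teacher(teacher_id)
);
""")

conn.execute("INSERT INTO teacher VALUES (1, 'Craig')")
conn.execute("INSERT INTO course VALUES (1, 'Business English', 1)")

# Joining on the foreign key recovers the relationship the schema encodes
row = conn.execute("""
    SELECT t.name, c.title
    FROM course c
    JOIN teacher t ON t.teacher_id = c.teacher_id
""").fetchone()
print(row)  # ('Craig', 'Business English')
```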

June 2020
Tableau

I had some experience with Tableau and other BI software from my financial services days, but I wanted to expand my knowledge and see what new features were available.

I spent a lot of time going through their elearning paths and working in Tableau, making visualisations and dashboards. I was impressed with how quickly it’s possible to put together something that can engagingly convey a lot of interactive data to the people in an organisation who need that information. I can see why Tableau is widely used by enterprises, although the nerdiest part of me still really enjoys the process of typing out raw Python code.

I can easily see an effective workflow here: perform the data cleaning & analysis using the PyData stack, then use Tableau to quickly and easily visualise the information and share it with stakeholders.
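The pandas half of that workflow might look something like this: take messy raw data, tidy it up, and emit a clean flat table that Tableau (or any BI tool) can ingest. The data here is an invented toy example, just to show the shape of the hand-off:

```python
import pandas as pd

# Toy raw data with typical cleaning problems: inconsistent labels
# and missing values (hypothetical sales figures)
raw = pd.DataFrame({
    "region": ["north", "North", "south", None],
    "sales":  [120.0, 80.0, None, 50.0],
})

clean = (
    raw.dropna(subset=["region"])     # drop rows with no region at all
       .assign(region=lambda d: d["region"].str.title(),  # unify labels
               sales=lambda d: d["sales"].fillna(0.0))    # fill gaps
       .groupby("region", as_index=False)["sales"].sum()
)

# A tidy flat table like this is exactly what Tableau ingests happily
csv_text = clean.to_csv(index=False)
print(csv_text)
```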

I picked up a few qualifications from the elearning platform too – I am a verified Tableau Data Scientist among other things.

May 2020
Third article published in Towards Data Science - Creating an interactive map in Python using Bokeh and pandas

Continuing my theme of giving back to the Python code community, I wrote up what I had learned about using Bokeh to create an interactive visualisation as a tutorial, and was delighted that it too was accepted and published on Towards Data Science. I received some really positive feedback on this one - people seemed to enjoy it and find it useful - which I found really gratifying.

May 2020
Bokeh

After seeing some slick visualisations created by Twitter friends, I was inspired to pick up Bokeh and see if I could make a nice interactive visualisation myself. I used my avocado dataset again so I could dive straight into the visualisation part, and I was quite happy with the results.

May 2020
HTML5 & CSS

I decided that if I wanted to show people my work I would need to establish a portfolio website. I had previously made a very nice-looking website for my English teaching company using Squarespace, which was easy and looked great, but was very expensive. Seeing the cool things my Twitter community and the people on the #100DaysOfCode hashtag were doing inspired me to think that I could probably give it a go myself.

I took a course on Udemy to get the basics of HTML & CSS, and then set to work building this very website you are enjoying right now. I’m by no means a professional front-end dev, but I’m pretty proud of what I have been able to put together. I enjoyed the immediate feedback to little (and large) changes you can get when working on the front end just by looking in the browser. This is definitely something I’d like to develop further on the side – I already have an intro to JavaScript course lined up for some point in the future.

April 2020
Bar Chart Race

I made my own attempt at a bar chart race using Matplotlib’s animation API and the gif library in Python. I wanted to learn how to make animated visualisations using Matplotlib, and this was a good use case for me to learn how that works.

I also used the same dataset with the then-just-released bar_chart_race Python package, which does a similar thing but in a more pre-packaged (and significantly easier for the end user) way.
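The essence of the Matplotlib approach is a function that redraws one frame, handed to FuncAnimation. Here is a minimal sketch with invented toy data — the re-sorting of the bars on every frame is what produces the "race" effect:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Toy cumulative totals for three categories over time
# (hypothetical values, just enough to drive the animation)
frames = [
    {"A": 1, "B": 3, "C": 2},
    {"A": 4, "B": 3, "C": 5},
    {"A": 6, "B": 8, "C": 5},
]

fig, ax = plt.subplots()

def draw(i):
    # Each frame clears the axes and redraws the bars sorted by
    # value - the changing order is the "race"
    ax.clear()
    pairs = sorted(frames[i].items(), key=lambda kv: kv[1])
    ax.barh([k for k, _ in pairs], [v for _, v in pairs])
    ax.set_title(f"Step {i + 1} of {len(frames)}")

anim = FuncAnimation(fig, draw, frames=len(frames), interval=500)
# anim.save("race.gif", writer="pillow")  # export as a GIF (needs Pillow)
```

The bar_chart_race package wraps essentially this loop (plus interpolation between frames) behind a single function call, which is why it is so much easier for the end user.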

April 2020
Second article published in Towards Data Science - Where to swim in Berlin!

I wanted to work with some dirtier real-world datasets, so I explored the data available via Open Data Berlin. I found some interesting (to me at least) data about Badestellen (bathing spots) in the city and did some visual EDA using pandas to find out where to move if you want to have the most outdoor swimming options (surely the critical metric to consider when choosing a new place to live).

I was very pleased that the tutorial / article that I wrote about it was also accepted and published in Towards Data Science. This was positioned more as a ‘beginner’s guide’, trying to give people just getting to grips with pandas some exposure to its power and some ideas about how to use it to join data sets and extract insights from them, and to direct people to the best resources I had used in my own learning journey to that point.
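The joining-and-aggregating pattern the article walks through can be sketched with pandas merge and groupby. These are invented toy rows and column names, not the real Open Data Berlin Badestellen data, which is considerably richer:

```python
import pandas as pd

# Toy stand-ins for two Open Data Berlin tables (hypothetical rows)
spots = pd.DataFrame({
    "spot":     ["Wannsee", "Müggelsee", "Seddinsee", "Plötzensee"],
    "district": ["Steglitz-Zehlendorf", "Treptow-Köpenick",
                 "Treptow-Köpenick", "Mitte"],
})
districts = pd.DataFrame({
    "district":   ["Steglitz-Zehlendorf", "Treptow-Köpenick", "Mitte"],
    "population": [310000, 270000, 380000],
})

# Join the two datasets on their shared column, then count bathing
# spots per district - the critical "where should I live?" metric
merged = spots.merge(districts, on="district")
counts = (merged.groupby("district")["spot"]
                .count()
                .sort_values(ascending=False))
print(counts.index[0])  # the district with the most bathing spots
```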

March 2020
First article published in Towards Data Science - Mapping Avocado Prices in Python with GeoPandas, GeoPy and Matplotlib

This was the ‘fruit’ (see what I did there?) of my own work on personal projects where I was really putting pandas to work, using Jupyter Notebooks and a variety of Python libraries. I set myself a challenge, found a question that I wanted to answer (and in this case, a visualisation that I wanted to produce) and kept going until I got there. There were ups and downs (scroll through my Twitter timeline in February and March 2020 if you want to follow that emotional rollercoaster!) but with help from my analytics community on Twitter, along with countless articles on Towards Data Science and answers on Stack Overflow, I made it to my goal.

I’m a firm believer in the idea of paying things forward - something I’ve found really inspiring about the data community I’ve met as I’ve been on my coding and data analytics journey is how generous people are with their time and expertise. I wanted to make my own contributions to the community to leave something to help other people who might face similar challenges, so I wrote my first article providing a how-to to help future generations of analysts learn from my experiences.

After reading so many articles on Towards Data Science I was absolutely delighted when they agreed to publish my own piece.

January 2020
Python

In January I decided to dive into Data Analysis and really go for it in terms of teaching myself. I surveyed the landscape and chose learning Python (focusing on the PyData ecosystem) as my point of entry. I signed up for a year’s subscription to DataCamp and immediately started working my way through their ‘Data Scientist with Python’ program.

I had played with Python before, but this was my first really serious attempt, and my first time working with libraries that I would grow to love like pandas and Matplotlib. As I progressed through the courses it became clear to me that taking tutorials alone would not be enough, and that working on my own projects was the best way for me to really get to grips with the tools I was learning.

I have learned a huge amount since then, and I still have a huge amount to learn about using Python for Data Analysis. That’s what I find really exciting about it!

January 2020
Scrum / Agile

At the end of 2019 I became interested in learning more about Scrum and other Agile methodologies, after an in-depth discussion with a friend back in Scotland who works as an Agile Coach, and after teaching English in a number of companies employing variations of this methodology.

It’s plain to see the benefits which an Agile approach offers companies, especially tech and software companies where the iterative methodology is a natural fit, but also other types of product and service. I started using some of the tools in my own work and even in my personal life (when my family & I moved flat in March 2020 I set up a Kanban board that really helped us manage that whole project. I felt very cool!). I decided to formalise my knowledge by taking the Professional Scrum Master I certification.

February 2019
Grundqualifizierung für die Unterrichtsarbeit mit Erwachsenen im Fachbereich Allgemeine Erwachsenenbildung - Berliner Senatverwaltung für Bildung, Jugend und Familie

The "Basic qualification for teaching work with adults in the field of general adult education" is a German-language qualification offered by the Berliner Senat, Humboldt University and the Berliner Volkshochschule as part of their effort to professionalise the workforce of teachers working with the Volkshochschule. To do this I took a series of different courses all themed around pedagogy and teaching adults, including auditing a semester at Humboldt University.

This was challenging to do in German - my second language - but I found it really enjoyable to be pushed in that way, and I appreciated the variety of courses. I learned a lot working through them, met a lot of other professionals, and gained a great deal through the Erfahrungsaustausch (exchange of experiences). It was also great to do a semester at Humboldt University!

January 2018
Deutsch-Test für Zuwanderer (DTZ) - Bundesamt für Migration und Flüchtlinge

Having lived and worked in Germany since 2014, I have developed pretty good German language skills (although there is still a lot of room for improvement!).

Without wishing to confirm any cultural prejudices regarding Germany, I will simply say that it is often helpful here to have a certificate to confirm that you can do the things you claim to be able to do. With that in mind I signed up for and passed the DTZ (German Test for Immigrants). I received 99% in the speaking section (I’d dearly like to know where I dropped that 1%!), confirming that my German language skills were above B1 level back then.

November 2014
CELTA – Cambridge English

For a wide variety of reasons (ask me when you interview me!) I changed career in 2014, leaving Financial Services behind to move to Germany and become an English Teacher. To do that I chose to take the most widely recognised qualification for EFL teachers – the Certificate in Teaching English to Speakers of Other Languages (CELTA). This was an intensive course of study which required facing lots of new challenges and throwing myself into learning how to teach as well as discovering and clearing up gaps in my own knowledge of the English language.

This set me up very well for my successful career as an EFL trainer, where I have also had the opportunity to massively improve my communication and speaking skills.

2010 - 2012
Investment & Finance

I wanted to further my knowledge and gain some professional certifications to back up my practical knowledge, so I took a series of courses and exams focused on investment management (my professional field from 2006 - 2014).

I started with the Investment Operations Certificate from the CISI, then moved on to the Investment Management Certificate from the CFA Society of the UK, and then started working towards becoming a Chartered Financial Analyst (CFA). I completed Level 1 of the 3 CFA levels, but due to moving to a new company which didn’t support the CFA program, and then changing career entirely in 2014, I didn’t complete levels 2 & 3. I learned a huge amount about financial analysis in the course of completing my Level 1 certification however.

July 2006
MA(Hons) First Class – Sociology, University of Edinburgh

In 2006 (a long time ago now!) I graduated from my four-year degree course at the University of Edinburgh with a First Class Honours Degree in Sociology. I learned a huge amount during those years, including my first forays into quantitative analysis. Ah, SPSS, my old friend!

After my MA I decided not to continue in academia, and moved into industry. I found work in financial services (more details on my CV or LinkedIn profile), where I spent time building fair value pricing models, ensuring pricing data quality and managing treasury data, among a wide range of other things.