I spent the last part of 2020 working on some exciting projects that should bear fruit in 2021, but until they do I'm afraid I have to be uncharacteristically coy about them. However, working on them has given me the opportunity to get much more involved with Amazon Web Services. I decided to take the chance to formalise my new knowledge and took the AWS Certified Cloud Practitioner exam.
I enjoyed studying for this one and learning about the huge array of services offered by AWS, as well as using them practically for the projects I'm working on. Next up is the AWS Solutions Architect Associate!
After posting about the Berlin Covid-19 Dashboard, which I built with Streamlit so I could easily keep an eye on developments in my home district as we enter the autumn and winter, I was contacted via Twitter and asked whether I would like to participate in the beta of Streamlit's new "For Teams" product.
Obviously this made me feel very cool and I said yes. It seems a social media addiction can pay off. Who knew? The product is super impressive: you simply build an app with Streamlit (normally I wouldn't say simply, but I'm genuinely so impressed with this package and how easy it makes it), add the .py file and the requirements.txt to a public GitHub repo, and point Streamlit at it. Then boom! Working web app.
Even better, to make updates you just push them to the repo and it happens automatically. So nifty! I personally am quite enjoying my journey into cloud infrastructure and getting into the details, but I know that's not the case for everyone. For data people who just want to quickly share their results, this is really worth checking out.
At present I have my Berlin Covid-19 Dashboard and my Scotland Covid-19 Dashboard running on it. It's also free for public repos, and will remain so when it moves out of beta. I promise I'm not being paid to shill for them, but I'm just so impressed. Great work Streamlit! Send me a t-shirt please!
Serverless / Cloud hosting of Dockerised Web Apps with Amazon Web Services and Google Cloud Platform.
Continuing my survey of Cloud computing / serverless web application hosting, I deployed my Berlin Covid-19 Dashboard on GCP (which has some problems with Streamlit apps, so sadly is not the most stable of the instances I have running) and AWS (using EC2).
AWS was significantly trickier to set up, involving SSH-ing into a virtual Amazon Linux server, installing Docker on it and then leaving it to run, but it was extremely satisfying when I got it to work. I am using the free tier and I'm mildly paranoid I'm somehow going to run up a massive bill. So far so good however!
This is a really useful skill to have. It is extremely clear that the future (the present, really) of apps is in cloud microservices, so it is exciting to learn about. Also, the learning experience of going from "I'm scared - what the heck is any of this? Don't hurt me" to "Ah yeah, just let me load up an Elastic Compute Cloud server instance, quickly install some dependencies and then deploy my containerised web application" has been really enjoyable. And still so much to learn! Exciting times.
Serverless / Cloud hosting with Heroku.
Heroku is an excellent Platform-as-a-Service company which makes it (relatively) easy to deploy web applications to the cloud and get them up and running without the hassle and expense of operating your own server.
It's a really useful service. There is something of a learning curve, but their guides are very useful and there is a lot of support available. My experience with it has been overwhelmingly positive, and I suggest you take a look if you are in the market for somewhere to host your web applications.
Streamlit is a relatively new package aimed directly at the Python data community, with the intention of making it easy to build and deploy applications to share results, models and analyses with others.
Honestly I think Streamlit could be a game-changer in this space. It allows easy creation of good-looking and interactive web applications using pure Python code, so it's very powerful. No more hours spent tinkering with HTML and CSS if all we want to do is share some results (although some of us quite enjoy spending hours tinkering!).
My first application in Streamlit was to deploy a Machine Learning model which can make predictions of the value of a house in California (in 1990) based on the parameters which the user can set. This was very easy to set up (relatively speaking), and I will definitely be using Streamlit more in future for quick prototyping and deployment of data applications.
I wanted to create a web application where I could use my Python skills to power the back end. I chose to use the Flask framework to do this.
Next I created a fun text generator app called Holla if ya hear me. More info about that is here. That was also really fun. I'm now quite comfortable using Flask to build web applications in Python, and I look forward to building some more complex apps in future.
I wanted to share some knowledge I had acquired recently about working with Python and MySQL together, so I wrote this piece to guide people through the process and hopefully show some of the power of combining these two technologies for Data Analysis.
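To give a flavour of what that combination looks like, here is a minimal sketch. It is not code from the article: it uses Python's built-in sqlite3 module as a stand-in for a MySQL connection (an assumption, so that it runs without a server), and the `lessons` table and its rows are invented for illustration. MySQL drivers for Python follow the same connect / cursor / execute pattern, although they typically use `%s` rather than `?` as the placeholder.

```python
import sqlite3

# In-memory SQLite database: a stand-in for a MySQL server (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE lessons (id INTEGER PRIMARY KEY, client TEXT, hours REAL)")

# Hypothetical rows; the parameterised query keeps the values out of the SQL string.
rows = [("Acme", 1.5), ("Acme", 2.0), ("Globex", 1.0)]
cur.executemany("INSERT INTO lessons (client, hours) VALUES (?, ?)", rows)
conn.commit()

# Let SQL do the aggregation, then hand the result to Python as a dict.
cur.execute("SELECT client, SUM(hours) FROM lessons GROUP BY client ORDER BY client")
hours_by_client = dict(cur.fetchall())
print(hours_by_client)  # {'Acme': 3.5, 'Globex': 1.0}

conn.close()
```

The division of labour is the appealing part: the database does the filtering and aggregation, and Python picks up the result as ordinary data structures for further analysis or plotting.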
I was very excited to publish this one in freeCodeCamp's publication. If you're not familiar with freeCodeCamp, it's a non-profit organisation dedicated to providing coding education in the form of tutorials and courses, for free, to anyone who wants it. They have a huge range of courses available on many languages and aspects of technology. I have used and appreciated their resources, so it was great to be able to give a little bit back by authoring this article.
Earlier this year I learned HTML & CSS to build the website you are reading right now, and I wanted to put my new abilities to further use.
For the last five years I have been using (and paying for!) Squarespace to build the website for my English teaching company. As a challenge (and to save a not-insignificant amount of money) I decided to re-design and build the website again myself. I'm happy with how it turned out.
I have wanted to get into ML for a while and have had this course from Google bookmarked for just as long, as it is a great (and free!) option to cover the basics. Google obviously know a thing or two about ML, and TensorFlow is one of the top options out there for building models, so it was a really useful introduction.
Having now built and experimented with Deep Neural Networks I am keen to dive further in and start building models to solve real problems.
I have heard great things about Harvard's intro to Computer Science course, and I decided to give it a look beginning in July. I am happy to report that the hype is justified!
I don't come from a CS background, so it's good to go through an intro course. A lot of the time it's like revision of things I'm familiar with, but there are frequent "Oooooh!" moments as something slots into place or a connection is made that I hadn't thought of before.
The instructor, David Malan, is really excellent, and I have found myself being really excited for each week's material. Plus the whole course can be audited for free, with access to the cloud-based IDE and automated checking of the weekly problem sets. Good stuff!
First series (third, fourth and fifth articles) published in Towards Data Science - MySQL Tutorial Series
As a way of wrapping up my recent dive back into SQL, I wrote a series taking the reader from conception to implementation and then to analysis of a relational database in MySQL. I again chose to publish on Towards Data Science.
This was a lot of work to put together, but I really enjoyed the process. As always, attempting to teach a topic helped me to really solidify my command of the material myself - I strongly recommend writing tutorials when you feel like you've understood something, because it forces you to confirm that you really do!
This was meant to be just one post, but I had so much I wanted to write about the topic that it grew to three:
I have used Google Analytics to monitor traffic and activity on my teaching business website, but I wanted to get to know more about it, as it is an essential tool for all larger businesses with an online footprint.
I took several courses at the Google Analytics Academy, and capped it off by passing the Google Analytics Individual Qualification. Now I feel even more confident in my understanding of the platform.
I took the time to refamiliarise myself with and dive much more deeply into creating and updating databases using SQL. I used SQL fairly extensively for query-writing back in my banking days, and I wanted to work on my database design and building skills. I really enjoyed the freeCodeCamp course (focused on MySQL), and I also took several DataCamp courses which helped deepen my understanding and expose me to PostgreSQL as well.
I had used SQL queries in Python code previously, but I wanted to get into designing a schema, building and populating a database from scratch and so on, as that’s a very important part of any real-world data application, and I just found it really interesting.
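As a small illustration of why schema design matters, here is a hedged sketch of two related tables. The `teacher` and `course` tables and their contents are hypothetical, and SQLite (via Python's built-in sqlite3 module) stands in for MySQL or PostgreSQL so that the example runs without a database server. The point is that a foreign key lets the database itself reject rows that would break the relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

# Hypothetical schema: every course must belong to an existing teacher.
conn.executescript("""
CREATE TABLE teacher (
    teacher_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE course (
    course_id  INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    teacher_id INTEGER NOT NULL REFERENCES teacher(teacher_id)
);
""")

# Populate the tables with made-up rows.
conn.execute("INSERT INTO teacher (teacher_id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO course (title, teacher_id) VALUES ('Business English', 1)")

# A course pointing at a teacher that does not exist is rejected by the schema.
try:
    conn.execute("INSERT INTO course (title, teacher_id) VALUES ('Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Join the two tables back together for analysis.
joined = conn.execute(
    "SELECT c.title, t.name FROM course AS c JOIN teacher AS t USING (teacher_id)"
).fetchall()
print(orphan_rejected, joined)

conn.close()
```

In MySQL or PostgreSQL foreign keys are enforced without the `PRAGMA` line; that statement is specific to SQLite.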
I had some experience with Tableau and other BI software from my financial services days, but I wanted to expand my knowledge and see what new features were available.
I spent a lot of time going through their elearning paths and working in Tableau, making visualisations and dashboards. I was impressed with how quickly it’s possible to put together something that can engagingly convey a lot of interactive data to the people in an organisation who need that information. I can see why Tableau is widely used by enterprises, although the nerdiest part of me still really enjoys the process of typing out raw Python code.
I could easily see a workflow being very effective in which the data cleaning & analysis is done using the PyData stack, and Tableau is then used to quickly and easily visualise the information and share it with stakeholders.
Third article published in Towards Data Science - Creating an interactive map in Python using Bokeh and pandas
Continuing my theme of giving back to the Python code community, I wrote up what I had learned about using Bokeh to create an interactive visualisation as a tutorial, and was delighted that it too was accepted and published on Towards Data Science. I received some really positive feedback on this one - people seemed to enjoy it and find it useful - which I found really gratifying.
I decided that if I wanted to show people my work I would need to establish a portfolio website. I had previously made a very nice-looking website for my English teaching company using Squarespace, which was easy and looked great, but was very expensive. I was inspired by my Twitter community and seeing the cool things people on the #100DaysOfCode hashtag were doing to think that I could probably give that a go myself.
I made my own attempt at a bar chart race using Matplotlib’s animation API and the gif library in Python. I wanted to learn how to make animated visualisations using Matplotlib, and this was a good use case for me to learn how that works.
I also used the same dataset with the then-just-released bar_chart_race Python package, which does a similar thing but in a more pre-packaged (and significantly easier for the end user) way.
Second article published in Towards Data Science - Where to swim in Berlin!
I wanted to work with some dirtier real-world datasets, so I explored the data available via Open Data Berlin. I found some interesting (to me at least) data about Badestellen (bathing spots) in the city and did some visual EDA using pandas to find out where to move if you want to have the most outdoor swimming options (surely the critical metric to consider when choosing a new place to live).
I was very pleased that the tutorial / article that I wrote about it was also accepted and published in Towards Data Science. This was positioned more as a ‘beginner’s guide’, trying to give people just getting to grips with pandas some exposure to its power and some ideas about how to use it to join data sets and extract insights from them, and to direct people to the best resources I had used in my own learning journey to that point.
First article published in Towards Data Science - Mapping Avocado Prices in Python with GeoPandas, GeoPy and Matplotlib
This was the ‘fruit’ (see what I did there?) of my own work on personal projects where I was really putting pandas to work, using Jupyter Notebooks and a variety of Python libraries. I set myself a challenge, found a question that I wanted to answer (and in this case, a visualisation that I wanted to produce) and kept going until I got there. There were ups and downs (scroll through my Twitter timeline in February and March 2020 if you want to follow that emotional rollercoaster!) but with help from my analytics community on Twitter, along with countless articles on Towards Data Science and answers on Stack Overflow, I made it to my goal.
I’m a firm believer in the idea of paying things forward - something I’ve found really inspiring about the data community I’ve met as I’ve been on my coding and data analytics journey is how generous people are with their time and expertise. I wanted to make my own contributions to the community to leave something to help other people who might face similar challenges, so I wrote my first article providing a how-to to help future generations of analysts learn from my experiences.
After reading so many articles on Towards Data Science I was absolutely delighted when they agreed to publish my own piece.
In January I decided to dive into Data Analysis and really go for it in terms of teaching myself. I surveyed the landscape and chose learning Python (focusing on the PyData ecosystem) as my point of entry. I signed up for a year’s subscription to DataCamp and immediately started working my way through their ‘Data Scientist with Python’ program.
I had played with Python before, but this was my first really serious attempt, and my first time working with libraries that I would grow to love like pandas and Matplotlib. As I progressed through the courses it became clear to me that taking tutorials would not be enough, and that working on my own projects was the best way for me to really get to grips with the tools I was learning.
I have learned a huge amount since then, and I still have a huge amount to learn about using Python for Data Analysis. That’s what I find really exciting about it!
At the end of 2019 I became interested in learning more about Scrum and other Agile methodologies, after an in-depth discussion with a friend back in Scotland who is working as an Agile Coach, and having worked in my capacity as an English teacher in a number of companies employing variations of this methodology.
It’s plain to see the benefits which an Agile approach offers companies, especially tech and software companies where the iterative methodology is a natural fit, but also those offering other types of product and service. I started using some of the tools in my own work and even in my personal life: when my family & I moved flat in March 2020 I set up a Kanban board that really helped us manage that whole project. I felt very cool! I decided to formalise my knowledge by taking the Professional Scrum Master I certification.
Grundqualifizierung für die Unterrichtsarbeit mit Erwachsenen im Fachbereich Allgemeine Erwachsenenbildung - Berliner Senatsverwaltung für Bildung, Jugend und Familie
The "Basic qualification for teaching work with adults in the field of general adult education" is a German-language qualification offered by the Berliner Senat, Humboldt University and the Berliner Volkshochschule as part of their effort to professionalise the workforce of teachers working with the Volkshochschule. To do this I took a series of different courses all themed around pedagogy and teaching adults, including auditing a semester at Humboldt University.
This was challenging to do in German - my second language - but I found it really rewarding to be pushed in that way, and I enjoyed the variety of courses. I learned a lot working through them, met a lot of other professionals and gained a lot through the Erfahrungsaustausch (exchange of experiences). It was also great to do a semester at Humboldt University!
Deutsch-Test für Zuwanderer (DTZ) - Bundesamt für Migration und Flüchtlinge
I have been living and working in Germany since 2014, and my German language skills have become pretty good (although there is still a lot of room for improvement!).
Without wishing to confirm any cultural prejudices regarding Germany, I will simply say that it is often helpful here to have a certificate to confirm that you can do the things you claim to be able to do. With that in mind I signed up for and passed the DTZ (German Test for Immigrants). I received 99% in the speaking section (I’d dearly like to know where I dropped that 1%!), confirming that my German language skills were above B1 level back then.
CELTA – Cambridge English
For a wide variety of reasons (ask me when you interview me!) I changed career in 2014, leaving Financial Services behind to move to Germany and become an English Teacher. To do that I chose to take the most widely recognised qualification for EFL teachers – the Certificate in Teaching English to Speakers of Other Languages (CELTA). This was an intensive course of study which required facing lots of new challenges and throwing myself into learning how to teach as well as discovering and clearing up gaps in my own knowledge of the English language.
This set me up very well for my successful career as an EFL trainer, where I have also had the opportunity to massively improve my communication and speaking skills.
Investment & Finance
I wanted to further my knowledge and gain some professional certifications to back up my practical knowledge, so I took a series of courses and exams focused on investment management (my professional field from 2006 - 2014).
I started with the Investment Operations Certificate from the CISI, then moved on to the Investment Management Certificate from the CFA Society of the UK, and then started working towards becoming a Chartered Financial Analyst (CFA). I completed Level 1 of the 3 CFA levels, but due to moving to a new company which didn’t support the CFA program, and then changing career entirely in 2014, I didn’t complete levels 2 & 3. I nevertheless learned a huge amount about financial analysis in the course of completing Level 1.
MA(Hons) First Class – Sociology, University of Edinburgh
In 2006 (a long time ago now!) I graduated from my four-year degree course at the University of Edinburgh with a First Class Honours Degree in Sociology. I learned a huge amount during those years, including my first forays into quantitative analysis. Ah, SPSS, my old friend!
After my MA I decided not to continue in academia, and moved into industry. I found work in financial services (more details on my CV or LinkedIn profile), where I spent time building fair value pricing models, ensuring pricing data quality and managing treasury data, among a wide range of other things.