This week, I sat down with my advisor in a Google Meet and we worked out how to deploy my Machine Learning model to the cloud. To do this, we use Microsoft Azure, which is a cloud computing tool that can do much more than deploying models, but that’s all we need it for.
The model that I have created throughout the internship is split up into two major parts. The first is the formatting, done by a tool called a TFIDFVectorizor, which takes the raw text of the article, and based on the frequency of terms in the text, it terns the text into a number. The list of numbers that has been transformed by the TFIDFVectorizer is then received by the second part of the model, which is the PassiveAgressiveClassifier (PAC). The PAC is fed two lists. One is the afore mentioned list of numbers from the TFIDFVectorizer, the other is a list that consists of the correct answers to each of the numbers, namely REAL or FAKE. The PAC takes both lists and looks at the number, classifies it over whether it is fake or real, then looks at the correct answer in the second list. If it is correct, nothing changes and it moves to the next number, but if it is wrong, it changes the rules on how the numbers are classified and continues on to the next number with these new rules. It repeats the same steps through the list for as long as the code tells it to, I told it to run 100 iterations, so it goes through the list and refines its rules 100 times.
So, why does it matter that I have 2 parts to my model?
When I want to export the model as a file, I need to export it twice, once for each part as I can only export one element using the exporting method used on machine learning models.
Why does having 2 separate files matter?
When you upload a model to Azure, the cloud computing tool, it only takes one file as each model, meaning that my model has effectively become two models that have nothing to do with each other from Azure point of view, meaning that I need to connect them in some way so that the input would come into the TFIDFV and the output of that would loop into the PAC model and the output of the PAC model would be sent to me. This is beyond what both me and my advisor know how to do. So we stopped at deploying both of these models as separate files. My model is technically in the cloud but it is a sorry excuse for a machine learning algorithm after having been dissected by Azure.
This ends my internship. In the coming weeks, I will be focusing on my presentation on the literary review I wrote on Algorithmic Bias in Machine Learning.