Machine Learning Things
Machine Learning Things is a lightweight Python library that contains functions and code snippets that I use in my everyday research with Machine Learning, Deep Learning, and NLP.
I created this repo because I was tired of always looking up the same code from older projects and I wanted to gain some experience in building a Python library. Making it available to everyone gives me easy access to code I use frequently, and it can help others in their machine learning work. If you find any bugs or something doesn't make sense, please feel free to open an issue.
That is not all! This library also contains Python code snippets and notebooks that speed up my Machine Learning workflow.
Table of contents
- ML_things: Details on the ml_things library and how to install and use it.
- Snippets: Curated list of Python snippets I frequently use.
- Notebooks: Google Colab notebooks from old projects that I converted to tutorials.
ML_things
Installation
This repo is tested with Python 3.6+.
It's always good practice to install ml_things in a virtual environment. If you need guidance on using Python's virtual environments, you can check out the user guide here.
You can install ml_things with pip from GitHub:
pip install git+https://github.com/gmihaila/ml_things
Functions
pad_array [source]
def pad_array(variable_length_array, fixed_length=None, axis=1)
| Parameters: | variable_length_array : array Single arrays [1,2,3] or nested arrays [[1,2],[3]]. fixed_length : int Max length of rows for numpy. axis : int Directions along rows: 1 or columns: 0. |
| Returns: | numpy_array : axis=1: fixed numpy array shape [len of array, fixed_length]. axis=0: fixed numpy array shape [fixed_length, len of array]. |
Example:
>>> from ml_things import pad_array
>>> pad_array(variable_length_array=[[1,2],[3],[4,5,6]], fixed_length=5)
array([[1., 2., 0., 0., 0.],
[3., 0., 0., 0., 0.],
[4., 5., 6., 0., 0.]])
batch_array [source]
def batch_array(list_values, batch_size)
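The source lists only the signature for batch_array, so the following is a standalone sketch of the kind of batching the name suggests, not the library's implementation. The helper name `batch_list` is hypothetical, and the real function's return type and edge-case handling may differ.

```python
def batch_list(values, batch_size):
    """Split `values` into consecutive chunks of at most `batch_size` items.

    Hypothetical helper for illustration only; it is not part of ml_things.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    # Step through the list in strides of `batch_size`; the last chunk
    # may be shorter when the length is not an exact multiple.
    return [values[i:i + batch_size] for i in range(0, len(values), batch_size)]


batches = batch_list([1, 2, 3, 4, 5], batch_size=2)
print(batches)  # [[1, 2], [3, 4], [5]]
```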
plot_confusion_matrix [source]
plot_confusion_matrix(y_true, y_pred, classes='', normalize=False, title=None, cmap=plt.cm.Blues, image=None,
verbose=0, magnify=1.2, dpi=50)
download_from [source]
download_from(url, path)
Snippets
This is a wide variety of Python snippets without a particular theme. I list the most frequently used ones first while keeping a logical order. I like to keep them as simple and as efficient as possible.
| Name | Description |
|---|---|
| Read File | One liner to read any file. |
| Write File | One liner to write a string to a file. |
| Debug | Start debugging after this line. |
| Pip Install GitHub | Install library directly from GitHub using pip. |
| Parse Argument | Parse arguments given when running a .py file. |
| Using Doctest | How to run a simple unit test using function documentation. Useful when you need to run unit tests inside a notebook. |
| Unittesting | Simple example of creating unittests. |
| Sort Keys | Sorting a dictionary by its keys. |
| Sort Values | Sorting a dictionary by its values. |
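To give a feel for a few of the entries above, here is a minimal sketch of the file, sorting, and doctest snippets using only the standard library. These are my own illustrative versions and may differ from the snippets in the repo; the file name, the `scores` dictionary, and the `add` function are assumptions made up for the example.

```python
import doctest
import tempfile
from pathlib import Path


def add(a, b):
    """Return the sum of two numbers.

    >>> add(2, 3)
    5
    """
    return a + b


# Write a string to a file, then read it back (one line each).
# The temporary path is a placeholder chosen for this example.
file_path = Path(tempfile.mkdtemp()) / "example.txt"
file_path.write_text("hello snippets", encoding="utf-8")
content = file_path.read_text(encoding="utf-8")

# Sort a dictionary by its keys and by its values.
scores = {"b": 3, "c": 1, "a": 2}
sorted_by_key = dict(sorted(scores.items()))
sorted_by_value = dict(sorted(scores.items(), key=lambda item: item[1]))

# Run the docstring example of `add` as a test. This prints nothing when
# the example passes, and it also works inside a notebook cell.
doctest.run_docstring_examples(add, {"add": add}, name="add")
```

The dictionary sorts rely on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.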
Notebooks
This is where I keep notebooks from previous projects that I turned into small tutorials. I often use them as a basis for starting a new project.
All of the notebooks are in Google Colab. Never heard of Google Colab? 🙀 You have to check out the Overview of Colaboratory, Introduction to Colab and Python, and what I think is a great Medium article about how to configure Google Colab Like a Pro.
If you check /ml_things/notebooks/, you will find many notebooks that are not listed here because they are not in a 'polished' form yet. These are the notebooks that are good enough to share with everyone:
| Name | Description | Colab Link |
|---|---|---|
| Pretrain Transformers | Simple notebook to pretrain a transformers model on a specific dataset using transformers from Hugging Face. | |
Final Note
Thank you for checking out my repo. I am a perfectionist, so I will make a lot of changes when it comes to small details.
Learn more about me? Check out my website gmihaila.github.io!