git lfs track "*.csv"
Git Large File Storage
Learning Objectives
- Versioning large files in git.
- Git Large File Storage
Motivation
Git was designed to track small changes in small text files.
So by default, it is not well-equipped to handle larger files (e.g. greater than 100 MB).
But many datasets (in particular the ones you will use for the final project) are larger.
Installation and Usage
Git-LFS is an extension to git to version large files. You can install it and use it with the following steps:
Once per computer, install git-lfs:
- Ubuntu : Open up the terminal and run:
sudo apt-get install git-lfs
- Mac OSX: Open up the terminal and run:
brew update
brew install git-lfs
- Windows:
- Download the windows installer from: https://git-lfs.github.com/
- Run the windows installer
- Ubuntu : Open up the terminal and run:
Once per repo, set up git-lfs:
Open up the terminal.
Navigate to the repo where you want to use git-lfs.
Run
git lfs install
Select the types of files that you would like to place under git-lfs. These should be the large data files. For example, to place all CSV files under git-lfs, run:
Commit the hidden file “.gitattributes”
git add .gitattributes git commit -m "Add .gitattributes"
Use git as you normally would.
If you accidently committed large files before running
git lfs install
, then you can retroactively update your repo to commit them with git-lfs by:git lfs migrate import --include="*.csv" --everything
The above code is for CSV files. You should change the above code based on the file type.
Paying for storage
GitHub charges money for storing large files.
I used an education discount to obtain free data storage for all of the repos in our class organization. So you won’t need to pay anything.
To store large files for your personal repos (outside of your classwork), click on your photo at the top right of the the GitHub website. Go to Settings > Billing > Add more data. You can then choose the data plan you want.