Import CSV with BeanHub Import
We already have the bank transactions as CSV files from the previous step, either downloaded manually from the bank's website or pulled with BeanHub Direct Connect. Now what? If we look at our transaction data carefully, we can always find repeating transactions: rent, internet service fees, or a mobile data plan appear again and again on a regular schedule. We also usually categorize repeated purchases from the same merchant as the same type of expense, and businesses run payroll for employees regularly. In the end, very few transactions are unexpected or one-time. The key to making your accounting book as automatic as possible is to have software run all those transactions through predefined rules and create the corresponding accounting entries from the data imported from the bank.
Various tools in the plaintext accounting community can help you import transactions from CSV files and other sources. But the data usually comes in different shapes, which makes it hard to work with. Many tools also couple data extraction and transaction generation together, making it very hard to reuse the same logic elsewhere. To solve those problems, we split extracting and importing into separate responsibilities when building our open-source tools for importing Beancount transactions. For the extracting part, we built beanhub-extract. It's a simple library that extracts data from CSV files, and potentially files in other formats, and provides a standardized data structure for beanhub-import or other import engines to consume.
Here are the currently available fields in the Transaction data structure beanhub-extract provides:
- extractor: name of the extractor
- file: the filename of the import source
- lineno: the entry line number in the source file
- reversed_lineno: the entry line number in the source file, counted in reverse order. Comes in handy for CSV files in descending datetime order
- transaction_id: the unique id of the transaction
- date: date of the transaction
- post_date: date when the transaction posted
- timestamp: timestamp of the transaction
- timezone: timezone of the transaction, needs to be one of the timezone values supported by pytz
- desc: description of the transaction
- bank_desc: description of the transaction provided by the bank
- amount: transaction amount
- currency: ISO 4217 currency code
- category: category of the transaction, like Entertainment, Shopping, etc.
- subcategory: subcategory of the transaction, like Entertainment, Shopping, etc.
- pending: pending status of the transaction
- status: status of the transaction
- type: type of the transaction, such as Sale, Return, Debit, etc.
- source_account: source account of the transaction
- dest_account: destination account of the transaction
- note: note or memo for the transaction
- reference: reference value
- payee: payee of the transaction
- gl_code: General Ledger code
- name_on_card: name on the credit/debit card
- last_four_digits: last 4 digits of the credit/debit card
- extra: all the columns the extractor did not handle and map to Transaction's attributes go here
What beanhub-import is and how it works
Now, with beanhub-extract, we can easily extract transaction data from different sources into a standard data structure. Next, it is the job of beanhub-import to look at the transactions provided by beanhub-extract, see which rules they match, and generate the corresponding Beancount transactions for you. Unlike most importing tools for Beancount or other plaintext accounting systems, beanhub-import not only generates the transactions for you but is also smart enough to look at your existing Beancount transactions and update them for you. Here's how it works:
Step-by-step example
Now that you know how beanhub-import works, let's walk through an example step by step. Before that, you need to install BeanHub-CLI. You probably already did if you followed the guide for pulling bank transaction CSV files with BeanHub Direct Connect. If not, it's very simple. You only need to ensure you have Python 3.11 or later installed. Then, you can run:
pip install "beanhub-cli>=2.1.0"
Next, let's define a first, simple, empty beanhub-import rule file at .beanhub/imports.yaml with content like this:
FIXME
You must also ensure you have at least the main.bean Beancount file in your current folder. If not, you can create one with the following content:
FIXME
Now, you can run the import command of BeanHub-CLI by:
bh import
And you will see output like this:
FIXME
What just happened is that the import command read your import rule file at .beanhub/imports.yaml and tried to import transactions from the input sources based on the rules. Because the file contains no inputs and no rules, there is nothing for the import engine to do.
TODO: example