GSoC'20 progress : Week 1

Progress during the first week of the coding period

June 10, 2020 - 3 minute read -
GSoC 2020 KDE

GSoC 2020 with KDE

Greetings Reader,

The coding period began on June 1st. The plan for the first phase is to complete the following tasks:

  • A script that generates the SQLite DB from the BLZ data file downloaded from the Deutsche Bundesbank.
  • Modifications to the CMake build system to call this script at build time.
  • Replacing the code that uses the text data file with code that reads from this database.
  • Updating the tests, documentation and benchmark to work with the new database.
  • Modifying the command-line tool to support a user-supplied database.

Week 1

The first task for this week was to write a Python script that generates the database from the bank data file.
As the work progressed, the first challenge revealed itself. Currently, the program downloads the currently valid bank data at every build, so the data is always read from the latest file. With a single persistent database, the database needs to keep track of deleted entries, and each update should touch only the entries that were deleted or added. The alternative was to regenerate the database from scratch every time, but that would have led to the loss of information about the deleted entries.

To keep track of the date up to which an entry is valid, I proposed adding another field to the database, named valid_upto. This field is filled in when, during a data update, the data file from the bank marks the entry as deleted.
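As a rough illustration of the idea, here is a minimal sketch of a table with a valid_upto column and a query for the entries still valid on a given date. Only the valid_upto name comes from this post; the table name, the other column names, the ISO date format and the sample rows are assumptions made up for the example.

```python
import sqlite3

# Hypothetical schema: everything except "valid_upto" is an assumed name.
SCHEMA = """
CREATE TABLE IF NOT EXISTS institutions (
    bankcode   TEXT NOT NULL,
    method     TEXT,
    name       TEXT,
    bic        TEXT,
    location   TEXT,
    valid_upto TEXT   -- ISO date (YYYY-MM-DD); NULL while the entry is valid
)
"""


def valid_entries(conn, on_date):
    """Return the bank codes that are still valid on the given ISO date."""
    rows = conn.execute(
        "SELECT bankcode FROM institutions "
        "WHERE valid_upto IS NULL OR valid_upto >= ? "
        "ORDER BY bankcode",
        (on_date,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Made-up sample rows: one current entry and one marked as deleted.
conn.executemany(
    "INSERT INTO institutions VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("10000000", "09", "Example Bank A", "EXAMPLEAXXX", "Berlin", None),
        ("20000000", "09", "Example Bank B", "EXAMPLEBXXX", "Hamburg", "2020-06-07"),
    ],
)
print(valid_entries(conn, "2020-06-10"))
```

Storing the date as ISO text keeps plain string comparison consistent with chronological order, which is why the query can use >= directly.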

Another problem was how to get this date. The bank does not include any date information in the file. I was stuck on this for a day, but I figured it out eventually: I used CMake's string(REGEX MATCH ...) command on the website source file to extract the date. Figuring out the regex took some time. CMake's regex syntax is unusual and the documentation is very limited, and the web was not much help either. But I got through. Success at last, when I ran cmake and the date finally printed on the terminal!
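The CMake expression itself is not reproduced in this post. For readers more at home in Python, the same idea can be sketched with the re module. This is an analogue, not the code used in the project: the DD.MM.YYYY date format and the sample snippet are assumptions, and the real page may look different.

```python
import re
from datetime import date

# Assumption: the page states the date in German DD.MM.YYYY notation.
DATE_PATTERN = re.compile(r"\b(\d{2})\.(\d{2})\.(\d{4})\b")


def extract_date(page_source):
    """Return the first DD.MM.YYYY date found in the text, or None."""
    match = DATE_PATTERN.search(page_source)
    if match is None:
        return None
    day, month, year = (int(part) for part in match.groups())
    return date(year, month, day)


# Made-up snippet standing in for the downloaded page source.
sample = "<p>valid until 07.06.2020</p>"
print(extract_date(sample))
```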

The complete Python script can be found here.

To ensure that the database is not generated from scratch at every build, I used this method to check for the DB in the output path.

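from os.path import isfile, getsize

def existDB(file):
    """ Checks if the database file already exists
    """
    if not isfile(file):
        return False

    # A valid SQLite file starts with a 100-byte header
    if getsize(file) < 100:
        return False

    # Read the header and close the file handle afterwards
    with open(file, 'rb') as db:
        header = db.read(100)

    return header.startswith(b'SQLite format 3')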

I created two methods to add and update entries in the database.

submitInstitute(bankCode, method, bankName, bic, location)

This is used for inserting an entry into the database. For such entries, no valid_upto value is given. So, NULL is inserted in its place.

deleteInstitute(oldBankCode, newBankCode, method, bankName, bic, location, valid_upto)

This method updates the entry corresponding to oldBankCode and sets its valid_upto field. It then checks whether newBankCode is a valid entry and, if so, creates a new entry for it.
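The two methods described above could look roughly like the following sketch. It is not the code from the actual script: the extra conn parameter, the table layout, the create_table helper and the rule that an empty or all-zero newBankCode means "no successor" are all assumptions made for illustration.

```python
import sqlite3


def create_table(conn):
    """Create the hypothetical table used by this sketch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS institutions ("
        "bankcode TEXT, method TEXT, name TEXT, bic TEXT, "
        "location TEXT, valid_upto TEXT)"
    )


def submitInstitute(conn, bankCode, method, bankName, bic, location):
    """Insert a currently valid entry; valid_upto is left as NULL."""
    conn.execute(
        "INSERT INTO institutions VALUES (?, ?, ?, ?, ?, NULL)",
        (bankCode, method, bankName, bic, location),
    )


def deleteInstitute(conn, oldBankCode, newBankCode, method, bankName, bic,
                    location, valid_upto):
    """Mark the old entry as expired, then add the successor if there is one."""
    conn.execute(
        "UPDATE institutions SET valid_upto = ? "
        "WHERE bankcode = ? AND valid_upto IS NULL",
        (valid_upto, oldBankCode),
    )
    # Assumption: an empty or all-zero code means there is no successor.
    if newBankCode and newBankCode.strip("0"):
        submitInstitute(conn, newBankCode, method, bankName, bic, location)


# Usage with made-up data.
conn = sqlite3.connect(":memory:")
create_table(conn)
submitInstitute(conn, "10000000", "09", "Example Bank", "EXAMPLEXXXX", "Berlin")
deleteInstitute(conn, "10000000", "20000000", "09", "Example Bank",
                "EXAMPLEXXXX", "Berlin", "2020-06-07")
rows = conn.execute(
    "SELECT bankcode, valid_upto FROM institutions ORDER BY bankcode"
).fetchall()
print(rows)
```

Keeping the old row and only stamping it with a date is what preserves the history of deleted entries, instead of losing it on every regeneration.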

After finishing the Python script, I worked on the CMake files to handle the database generation. After a lot of small changes here and there, the work for Week 1 was completed successfully.

The database is now generated at build time with the expected values. Now, on to the part that actually uses it.

See you in the next post.

👋
Prasun.