timfi

`$ whoami`

My name is Tim Fischer, I'm a software engineer and computer science PhD student at the University of Tübingen. I live in south-western Germany, to more specific in the county of Böblingen, about 30 minutes south of Stuttgart and 20 minutes north of Tübingen. I work at my unisversity as a research assistant in the Database Systems Research Group. In my free time I do many things, for the most part I code around a bit, playing with some side projects from time to time. But the rest of the time I spend hanging out with some friends and doing gymnastics.

`$ ls ~/research`

Over my time as a PhD student in Tübingen I've been steadily working on compilation methods that target database systems as runtimes. I may not get around to updating this list as soon as new work is published, so if you are interested you can look at our chairs webpage (db@tue) or my orcid (0000-0002-6625-9627 ).

To Iterate Is Human, to Recurse Is Divine — Mapping Iterative Python to Recursive SQL

BTW2023 - Datenbanksysteme für Business, Technologie und Web

Student Paper (Best Student Contribution Award) • doi:10.18420/BTW2023-73 • db@tue

Abstract: Writing complex algorithms and iterative computations in SQL is difficult at best, commonly leading to code that intermingles looping control flow with database access. This yields programs with control flow that rapidly hops in and out of the database, with each roundtrip incurring significant overhead. We present the ByePy compiler, which can compile entire Python functions directly to plain recursive SQL:1999 queries. By doing so, the compilation eliminates all but a single roundtrip, leading to runtime speedups of up to an order of magnitude.

Snakes on a Plan: Compiling Python Functions into Plain SQL Queries

SIGMOD/PODS '22: International Conference on Mangement of Data

Demo Paper • doi:10.1145/3514221.3520175 • db@tue

Abstract: "Move your computation close to the data" is decades-old advice that is hard to follow if your code exhibits complex control flow. The runtime of such applications suffers from a continual back and forth between database-external code execution and plan-based SQL evaluation. We demonstrate the ByePy compiler which translates entire Python functions with arbitrary control flow-including deeply nested iteration-into plain recursive SQL:1999 queries. The invocation of a ByePy-compiled function enters the database engine once to execute the plan of a single query. Computation does not get much closer to the data than this. The system rewards this translation effort from Python to SQL with runtime improvements of up to an order of magnitude.

ByePy: Compilation of Python to SQL

M.Sc. Thesis • db@tue

Abstract: Databases have become ubiquitous in the world of application development and have found their way into the toolbelt of many python developers. Sadly, many of these developers still struggle with the database world’s de facto lingua franca SQL. This is no fault of their own; SQL differs drastically from imperative Python programming in its declarative nature. The struggle with SQL has led to many writing code interweaving complex control flow and database access. Such code exhibits poor performance compared to even the most complicated queries. The constant context switches between the Python interpreter and the underlying database add up quickly, resulting in a significant amount of time being wasted on overhead that does not further the computation itself. To address this issue, we present ByePy — a Python to SQL compiler, which can compile entire Python functions with arbitrary control flow into plain recursive SQL:1999 queries. Compilation with ByePy is rewarded by runtime speedups of up to an order of magnitude.

Utilizing parallelism in the two-phase approach of translated fsUDFs

B.Sc. Thesis • db@tue

Abstract: When analyzing recursive functions it is common to look at the function calls by building a call graph for an example input. Doing so a lot of the time we face the fact that we should be able to parallelize the parts of said call graph, that are mostly independent of each other. But some languages and systems make this process harder for us than it should be. One such system is the DBMS PostgreSQL, even though it allows for recursive UDFs they are poorly optimized and only viable for inputs yielding minimal amounts of recursive steps. To parallelize this directly is a chore in and of itself, but we can make use of a method introduced by Christian Duta and Torsten Grust in their 2020 paper “Functional-Style SQL UDFs With a Capital 'F'". They propose a method to translate a given 'functional-style' UDF into a query that frst builds the call graph and then walks said call graph to calculate the result. Injecting parallelization into this two phase approach is what this work focuses on. We propose a simple method of extending the two phases such that the building of the call graph also partitions said call graph into disjointed, or at least mostly disjointed, sub-problems and that the evaluation uses said information to actually run these sub-problems in parallel. We found during our testing, that there can be major benefts for certain problems, but also that not all recursive problems lend themselves to being parallelized. We found that in general the more tree-like the call graph is the more benefts can be gained from parallelization.

`$ ls ~/projects`

Most of my projects can be found either on github or on codeberg . The following are some of the ones I see as highlights:

`befunge-sql`

A graded project from university, the gist of it was to flex SQL's muscles a bit and implement some complex computational problem entirely inside of a single query. To that end, I opted to implement an interpreter for the esoteric programm language Befunge.

`dfb_predict`

A graded group project from university, the gist of it was that we needed to develop an application to predict game in the DFB 1. Liga, the German football league. It encompassed everything from data collection and processing, match prediction using stochastic methods, and UI development for an easy to use interface.

`pycells`

An implementation of simple cellular automata in python, but with a few twists. Said twists pertain to the some implementation details as—to make it more challenging—I decided to setup some limitations before actually starting out. The four core limitations I set myself were the following:

The simulation functions should be able to handle n-dimensional rules and states.
The rules should be given as integers. (Though this limits the rules I need to be able to simulate.)
The state should be represented as an integer, making use of the fact that the whole state consists of a n-dimensional array of booleans that can be flattened.
The simulations functions should make use of the integer-nature of the state representation and should minimize the use of iteration.

`bf.py`

A fully featured interpreter for Brainfuck in a single Python expression. I decided that just golfing it wasn't enough and as an extra hurdle I should only use a single expression, i.e. a "true oneliner".

`$ cat contact.info`

email: me@timfi.dev
github: @timfi
codeberg: @timfi
orcid: 0000-0002-6625-9627
work: db@tue