I'm wondering how I would be able to regression-test functions in my project (pgflow [0]) - it tracks a graph of state machines modeled in a few tables. State is mutated only by calling a few exposed SQL functions (the task queue worker does this).
Given that I can't enforce everything I need with check constraints, and I try to avoid triggers if possible, I opted for only using the exposed SQL API [1] to set up state in my pgTAP tests.
It is imperative and harder to maintain, like the scripts you described in the article, but it's really my only option, as I want the maximum confidence level.
Does RegreSQL support some kind of init scripts, or would I need to wire it up myself and just run RegreSQL after the proper state is set? I would lose the "run once and get a report on everything" benefit then :-(
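To make it concrete, the init script I have in mind would be something like this (just a sketch - the function names below are made-up placeholders, the real API lives behind [1]):

    -- seed.sql: set up state imperatively, going only through the
    -- exposed SQL API (placeholder names, not pgflow's actual functions).
    SELECT api.create_run('demo_flow');
    SELECT api.start_task('demo_flow', 'step_1');
    SELECT api.complete_task('demo_flow', 'step_1', '{"ok": true}'::jsonb);

and then RegreSQL would run against the resulting state.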
[0] https://pgflow.dev/ [1] https://github.com/pgflow-dev/pgflow/blob/main/pkgs/core/sup...
Plus I have a whole set of other requirements where RegreSQL suddenly seems like a good solution.
And without sounding cliché - thank you for the comment! This is exactly why I forced myself to go public and get this level of feedback.
I've been postponing proper, dedicated performance testing for some time and would really love to up my game in that regard.
I'm very happy with pgTAP's approach of running everything in a transaction and rolling it back after the test - how does this work in RegreSQL?
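For reference, the pattern looks roughly like this (a sketch - `tasks` is a placeholder table):

    -- pgTAP: the whole test runs in one transaction and is rolled
    -- back at the end, so no state leaks between tests.
    BEGIN;
    SELECT plan(1);

    -- ...mutate state through the exposed SQL API here...

    SELECT ok(
      EXISTS (SELECT 1 FROM tasks WHERE status = 'queued'),
      'task was queued'
    );

    SELECT * FROM finish();
    ROLLBACK;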
Would love to provide feedback and test the hooks when you're working on them. I'm mostly interested in performance testing; my use case would be to run them in CI and compare against previous metrics stored somewhere, in order to fail CI when performance regressions are introduced.
Happy to connect, got contact info in my profile.
I will definitely reach out, just give me a bit of time to mentally recover from the exposure - I've also got some meetups where I promised to deliver presentations, and they will consume a lot of my spare time.
You can instruct postgres to raise exceptions using the same error code that constraints use: that way your clients do not need to know the difference.
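Something along these lines (a sketch - the function and table are made up, but the ERRCODE mechanism is standard PL/pgSQL):

    -- Raise with SQLSTATE 23514 (check_violation), the same code a real
    -- CHECK constraint produces, so callers can't tell the difference.
    CREATE OR REPLACE FUNCTION complete_task(p_task_id bigint)
    RETURNS void
    LANGUAGE plpgsql
    AS $$
    BEGIN
      IF NOT EXISTS (
        SELECT 1 FROM tasks WHERE id = p_task_id AND status = 'started'
      ) THEN
        RAISE EXCEPTION 'task % is not in a completable state', p_task_id
          USING ERRCODE = 'check_violation';  -- SQLSTATE 23514
      END IF;
      UPDATE tasks SET status = 'completed' WHERE id = p_task_id;
    END;
    $$;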
That said, there ARE other places that would benefit from triggers (aggregate counts). I've avoided them because they're hot paths and I was worried about the perf impact, relying on pgTAP coverage instead.
Your defense-in-depth argument is solid, though. I should revisit this and benchmark whether the safety is worth the perf cost. Something like RegreSQL would come in handy there.
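The kind of trigger in question, for context (a sketch with made-up table names):

    -- Maintain a denormalized remaining_tasks counter on the parent run;
    -- this fires on a hot path, hence the perf worry.
    CREATE FUNCTION dec_remaining_tasks() RETURNS trigger
    LANGUAGE plpgsql AS $$
    BEGIN
      UPDATE runs SET remaining_tasks = remaining_tasks - 1
      WHERE id = NEW.run_id;
      RETURN NEW;
    END;
    $$;

    CREATE TRIGGER task_completed
      AFTER UPDATE OF status ON tasks
      FOR EACH ROW
      WHEN (OLD.status <> 'completed' AND NEW.status = 'completed')
      EXECUTE FUNCTION dec_remaining_tasks();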
So it would be nice if RegreSQL supported fixture hooks for those who take this route.
It would also be fun if you could support PGLite [0] - that's what I've been using to write "unit" tests connected to a "real" database.
[0] https://github.com/eradman/ephemeralpg [1] https://github.com/theory/pgtap
I deal with a lot of complex querying logic, mostly with Elasticsearch. My approach is to either unit test or integration test and just skip everything in between. With queries, what I care about is that they work under realistic scenarios against the same version of Elasticsearch that we use in production. Creating test fixtures is expensive, so tests for read-only querying share the same fixtures. That speeds things up.

I don't care about separately testing simple CRUD operations because most of my scenarios trigger plenty of those; a unit test for that has no value to me. Unit testing whether my API can send queries to some fake Elasticsearch, an empty Elasticsearch, etc. has limited value too. There's some limited value in knowing the queries I'm sending are syntactically correct, but you get that for free when a proper integration test passes. The unit test is redundant if you have a proper integration test, and a unit test without a proper integration test provides little practical value.
What I actually care about is that all the complicated dashboard and other queries - the ones that make a lot of assumptions about how data is structured, what fields are there, how they are indexed, and whether they can be null, blank, or have invalid values - work as intended. That calls for an integration test. Anything trivial enough that a unit test would suffice doesn't tend to need a lot of testing, and any scenario that touches enough of that stuff covers most of it anyway.
I put a lot of effort into ensuring that my integration tests run quickly, run concurrently, and don't interact with each other (data randomization). That allows me to get away with not deleting a lot of data between tests, and it gets me a lot of realism for free, because real users don't have an empty system completely to themselves. Having a lot of tests running against a busy system is integration-testing gold. I have close to 300 full API integration tests running in around 30 seconds on my laptop - close enough to unit-testing performance that I run them many times per day.
The same approach applies to database testing. Probably more so because all the interesting bugs usually relate to constraints, transactionality, database locks, etc. If you have flaky tests because of that, it might actually be because your database layer has some issues with the notion of users not being polite enough to queue up one by one.
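In SQL terms, the randomization trick boils down to keying each test's data to its own random identifier (a sketch - table and column names are made up):

    -- Each test creates its own tenant, so tests running concurrently
    -- against one busy database never collide on data.
    -- gen_random_uuid() is built in since PostgreSQL 13 (pgcrypto before).
    INSERT INTO tenants (id, name)
    VALUES (gen_random_uuid(), 'test-' || substr(md5(random()::text), 1, 8))
    RETURNING id;

    -- Every later insert and assertion filters on that tenant id, so
    -- nothing has to be deleted between tests.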
This is not for everyone, I realize; do what works for you. I've butted heads with people over this more than a few times. But in my company (of which I'm the CTO), we unit test functions, small classes, regular expressions, parsing logic, etc., and we integration test systems and APIs. Testing individual queries without the rest of the system is hard and pointless; test the API that triggers the query. That 30-second test-run performance is something I spent a lot of time getting to, and it means we can make major changes without fear: if the tests pass, our users should be fine.
With the current feature set, it's something I have already considered but am still undecided on.