Discussion about this post

Marcos H

This is a great post, with lots of good ideas for low-cost hosting. I have solved most of these problems in a far less cheap way: I have a DigitalOcean droplet (i.e., my own server in the cloud).

But having a server solves quite a few problems on the list:

* I can build Parquet or SQLite files on the server, or copy them to it, avoiding the GitHub limit

* I host 4 static sites on 4 different domains (and can add more when needed)

* For scripts with low CPU requirements, I run cron jobs on the server to rebuild databases or update data, like running my hurricane bot @nhc_atlantic_bot@vmst.io, which posts on Mastodon.

* For visualizations, I can run Dash or Shiny apps (though I have no active Shiny server running at the moment)
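
A minimal sketch of what a cron-driven database rebuild like the one in the list above could look like. It writes the new SQLite file to a temporary path and then swaps it into place, so anything reading the database never sees a half-written file. The `readings` table, its columns, and the function name are hypothetical placeholders, not the actual setup described here.

```python
import os
import sqlite3
import tempfile


def rebuild_database(db_path, rows):
    """Rebuild a SQLite file from scratch and swap it into place.

    `rows` is an iterable of (name, value) tuples. The `readings` table
    is a hypothetical placeholder used only for illustration.
    """
    directory = os.path.dirname(os.path.abspath(db_path))
    # Create the temp file next to the target so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".sqlite", dir=directory)
    os.close(fd)
    try:
        conn = sqlite3.connect(tmp_path)
        try:
            conn.execute("CREATE TABLE readings (name TEXT, value REAL)")
            conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
            conn.commit()
        finally:
            conn.close()
        # mkstemp creates files readable only by the owner; loosen the
        # mode so a web server running as another user can read the file.
        os.chmod(tmp_path, 0o644)
        # Replace the old database in one step.
        os.replace(tmp_path, db_path)
    except Exception:
        # Clean up the partial file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A small script that calls `rebuild_database` can then be scheduled from the crontab at whatever interval the data needs.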

It's a little trickier for more RAM/CPU-hungry things, but I have an always-on desktop machine at home (my Mac Mini) that runs the more intensive scripts (like building the semantic similarity and search for recordedvote.org) and then copies the results via scp to the server. Most people have laptops, not desktops.
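
The build-at-home, copy-to-server step could be scripted roughly as follows. This is a sketch under assumptions: the host, file names, and function names are hypothetical, and it relies on the `scp` command-line tool being installed and on SSH key authentication already being set up.

```python
import shlex
import subprocess


def build_scp_command(local_path, host, remote_path):
    """Return the scp argument list for copying one local file to a server."""
    # -p asks scp to preserve modification times and file modes.
    return ["scp", "-p", local_path, f"{host}:{remote_path}"]


def copy_to_server(local_path, host, remote_path, dry_run=False):
    """Copy a file to the server, or just print the command if dry_run is set."""
    cmd = build_scp_command(local_path, host, remote_path)
    if dry_run:
        print(shlex.join(cmd))
        return cmd
    # check=True raises CalledProcessError if scp exits with a non-zero status.
    subprocess.run(cmd, check=True)
    return cmd
```

Calling `copy_to_server("index.sqlite", "me@example.com", "/srv/data/index.sqlite", dry_run=True)` prints the command without touching the network, which is handy for checking paths before a real transfer.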

Similarly, my personal home page is built with Quarto and is complex enough that I build it on my Mac and sync it to the server over ssh.

You are much better about staging / prod workflow. I think I tested a major rewrite of the backend of Recordedvote.org on an alternate domain name / branch? Maybe? But most of the time I test the Dash app on my Mac, and push to prod. 🙃 YOLO.

All this said, while this works for me, it requires a lot of Linux server management skills that I think most data scientists would just as soon not acquire. Plus, the droplet is $14/mo. I have thought about moving to Hetzner to save some money, but migration would be tedious.

For some reason, I'm like allergic to GitHub Actions. I'd rather fiddle with cron jobs on my server than write YAML, I guess?
