aToad #24: Bulk Rename Utility and Duplicate Files Finder

The names are pretty self-explanatory… A mass file renamer and a tool to find duplicate files

Bulk Rename Utility is a Windows freeware that makes it “easy” to mass rename files.
On the plus side, it has plenty of features: you can use regular expressions, use ID3 tags and EXIF metadata, change the case, add numbering, preview file name changes, change the files’ creation date, etc. On the minus side, that feature-richness comes at a price: it’s not very simple to use, particularly when you don’t use it often. Since I rarely use it, most of the time I have to check the help file before doing anything. Still, I find it pretty great.

Duplicate Files Finder is a multi-platform (Windows/BSD/Linux) free and open source software that allows you to detect duplicate files and delete them. Pick the directories you want to include in the scan and hit Go. Optionally, you can include or exclude specific file names, and filter your search with minimum and maximum file sizes (excluding tiny files can speed up the search and make the results more readable).
The scan is pretty fast, as it first matches files by their size, and only then compares those that have the same size. The only big weakness of this software is that duplicates can only be deleted one by one, i.e. for every file that has duplicates, you have to manually select which one you want to keep. So it’s good for dealing with a few large duplicated files, but not very effective against many small duplicates (all the more reason to exclude smaller files from the search, IMO). Still, my typical use case is mostly large-ish files that aren’t too numerous (or whole duplicate folders), so it does the job efficiently enough.

Posted in A Tool A Day.


Upgrading Ubuntu Server from a LTS to the next (non-LTS) version

The configuration file to edit should be mentioned when you run “do-release-upgrade”, as long as a version newer than your LTS is available. But for some reason it wasn’t mentioned (or I missed it) several times when I previously ran this command, so I’m writing it down here so I won’t lose it anymore.

sudo apt-get update && sudo apt-get upgrade (because the release upgrade command will require you to have all current updates installed)
sudo nano /etc/update-manager/release-upgrades
Change Prompt=lts to Prompt=normal
sudo do-release-upgrade
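
For reference, after the edit, /etc/update-manager/release-upgrades should look roughly like this (comments stripped):

[DEFAULT]
Prompt=normal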

That’s pretty much it. Note that you can’t jump a version, so for instance if you’re on 17.10 and you want 18.10, you’ll have to upgrade first to 18.04 LTS and then to 18.10

Posted in Linux.


Renewing the Thecus N7510’s TLS certificate

The Thecus N7510 is a cheap NAS that used to be popular for its large number of drive bays (7) while still being as cheap as (or even cheaper than) most 4-bay NAS models.

It is powered by Thecus OS, but sadly it seems that its version of Thecus OS isn’t maintained very actively anymore. Particularly, the SSL/TLS certificate used for FTP over TLS expired about a month ago. Which is pretty annoying, because FileZilla refuses to let you permanently ignore a certificate expiration alert (for stupid reasons, but this isn’t the first time the FileZilla developers provide poor explanations for equally poor choices – we can only live with that).

So the only option I had left was to try to upgrade the NAS’s certificate by myself. Thankfully, this turned out to be fairly easy, as I had already written a guide on how to create your own self-signed certificate. So the only new (and minor) difficulty was to find where the current SSL/TLS certificate of the N7510 is. I quickly found that it’s /etc/ssl/private/pure-ftpd.pem, which contains both the server private key and the signed certificate (something very slightly different from my previous guide: you just need to combine the 2 files into one .pem file).

If they’re not already enabled, you need to enable SSH and SFTP from the ThecusOS control panel (the SSH & SFTP toggles are in Network Service > SSH)

Once this is done, here are the commands I used (cf the linked guide if you need more details) to generate the certificate:

cd /etc/ssl/private
# generate a passphrase-protected 4096-bit RSA private key
openssl genrsa -des3 -out servPriv.key 4096
# create a certificate signing request from that key
openssl req -new -key servPriv.key -out servRequest.csr
# keep a copy of the protected key, then strip the passphrase from the working copy
cp servPriv.key servPriv.key-passwd
openssl rsa -in servPriv.key-passwd -out servPriv.key
# self-sign the request, valid for 10 years
openssl x509 -req -days 3650 -in servRequest.csr -signkey servPriv.key -out signedStartSSL.crt

At this stage, you have everything you need except the “stashed” .pem file.
At first, I tried to use nano to create it, but the Thecus N7510 doesn’t have nano 😡 So, I connected via SFTP (with FileZilla) as root (that’s why I told you to enable SFTP along with SSH earlier). Then I grabbed servPriv.key and signedStartSSL.crt, and put them both into a single text file (not sure if the order matters) named newcert.pem.
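
If you’d rather stay in the SSH session, concatenating the two files with cat should also do the trick (a sketch, with the key first, as in the sample below):

cat servPriv.key signedStartSSL.crt > newcert.pem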

Just for the sake of clarity, newcert.pem looks like:

-----BEGIN RSA PRIVATE KEY-----
[base64 stuff]
-----END RSA PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
[more base64 stuff]
-----END CERTIFICATE-----

Finally, I uploaded newcert.pem into /etc/ssl/private, renamed pure-ftpd.pem to pure-ftpd.pem.bak, and renamed newcert.pem to pure-ftpd.pem.
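
In command form, that last step boils down to this (run from /etc/ssl/private, after uploading newcert.pem there):

mv pure-ftpd.pem pure-ftpd.pem.bak
mv newcert.pem pure-ftpd.pem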

All is now ready, the last thing you need to do is to restart the FTP server. The easiest way to do it is to disable then re-enable it via the ThecusOS control panel (Network Service > FTP).

Now, when you connect with FileZilla to the FTP server, you’ll see your new, non-expired, certificate, and will be able to trust it permanently (that is, until it expires in about 10 years).

Posted in FTP, security, servers.


Buffing your Apache HTTPS configuration

Setting up HTTPS on Apache with a basic configuration is now both trivial and cheap. Optimizing it for a (slightly) better security level requires a bit more digging, though. And a small trade-off: you’ll have to sacrifice fossil browsers, like MSIE pre-11, and generally most old versions of just about any browser. Spoiler: no one really uses those anyway.

First, here is my old configuration. It still gets an A on SSL Labs as I’m writing this, but it’s starting to have issues.

<VirtualHost *:443>
   ServerName gal.patheticcockroach.com
   DocumentRoot "/home/gal/"
   <Directory "/home/gal/">
   allow from all
   Options -Indexes
   </Directory>
   SSLEngine on
   SSLProtocol all -SSLv2 -SSLv3
   SSLHonorCipherOrder On
   SSLCipherSuite ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA256:AES128-GCM-SHA256:HIGH:!CAMELLIA:!RC4:!MD5:!aNULL:!EDH
   SSLCertificateFile /etc/letsencrypt/live/gal.patheticcockroach.com/cert.pem
   SSLCertificateKeyFile /etc/letsencrypt/live/gal.patheticcockroach.com/privkey.pem
   SSLCertificateChainFile /etc/letsencrypt/live/gal.patheticcockroach.com/fullchain.pem
   SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown
</VirtualHost>

Now, here is my new one:

<VirtualHost *:443>
   ServerName gal.patheticcockroach.com
   DocumentRoot "/home/gal/"
   <Directory "/home/gal/">
   Require all granted
   Options -Indexes
   AllowOverride All
   </Directory>
   Header always set Strict-Transport-Security "max-age=31536000"
   SSLEngine on
   SSLProtocol all -SSLv2 -SSLv3 -TLSv1 -TLSv1.1
   SSLHonorCipherOrder On
   SSLCompression Off
   SSLSessionTickets Off
   SSLCipherSuite ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256
   SSLCertificateFile /etc/letsencrypt/live/gal.patheticcockroach.com/cert.pem
   SSLCertificateKeyFile /etc/letsencrypt/live/gal.patheticcockroach.com/privkey.pem
   SSLCertificateChainFile /etc/letsencrypt/live/gal.patheticcockroach.com/fullchain.pem
</VirtualHost>

Note that this is using Apache version 2.4.29, while the old one was using something-older-not-sure-which-one. So, “allow from all” became “Require all granted”, and some new algorithms became available. But TLS 1.3 isn’t here yet.

First, I ditched SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown. It doesn’t really impact security, it’s just useless now, since the cipher suites we’ll pick aren’t supported by the MSIE versions that required this tweak.

Then, I disabled all SSL protocols but TLS 1.2. A more elegant way would be SSLProtocol -all +TLSv1.2, but I just wanted to keep the list for the moment. I’m actually not even sure if Apache still supports SSL v2, or even v3.
I handpicked some of the most modern cipher suites from here and there, disabled compression and session tickets (because reasons), and added a Strict-Transport-Security header. About this last one, a value of “max-age=31536000; includeSubDomains; preload” might be even better: “preload” allows preloading, and while I’m not sure about “includeSubDomains”, I’ve seen it used in a bunch of guides.
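
For reference, that stricter header would look like this (only use “preload” if you actually intend to keep serving HTTPS and to submit the domain to the preload list):

Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"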

And that’s basically it, already. With this I’m getting an A+ on SSL Labs and in other places, most of which insist heavily on setting a very long HSTS max-age (watch out: once you set it, you have to keep maintaining the HTTPS version of your site, or people who already visited it won’t be able to access it anymore for a long while).

Last but not least, here’s a little list of services that you can use to test your HTTPS setup:
SSL Labs
HT Bridge
Cryptcheck

And here’s an even longer list, but the sites other than those I already listed seem vastly inferior to me, with the exception of a few services that focus essentially on the “administrative” details of your certificate. Notably, this one will let you download the certificates that are missing from your chain, if any (it shouldn’t be needed, but it’s a fun feature still).

Posted in security, servers, web development.


An important bias to know about consumer reviews

In a previous life, I used, among other things, to search and avoid biases in scientific studies. Not the SJW kind of bias, the statistical kind of bias. Once you’ve acquired that mindset, you never fully abandon it and you just tend to casually check for biases everywhere they may exist.
A domain prone to bias is consumer reviews. If only because unhappy consumers tend to be more vocal than the happy ones. But that’s obvious and not what I’ll be writing about here.
A year and a half ago, I posted a short post about why Tomtop’s products all have high ratings. Long story short, there is what you could call a “selection bias” in the customer reviews posted on Tomtop: all the reviews below 4 stars (or maybe the threshold is 3, I don’t remember, but it’s certainly not lower) are not “selected” (well, they’re plainly never published). I don’t know if it still applies today, but you could probably find out for yourself rather easily (just look for negative reviews for a while, and stop once you find one… or get tired of not finding any). End of the introductory bias story.

A few months ago, a UPS (uninterruptible power supply) I bought about 4 years ago became faulty. I know those things don’t last forever, notably because the battery wears out, but it wasn’t a “the battery is worn-out” issue, and anyway 4 years seemed a bit too short-lived. So I thought, eh I’ll leave a review on that product, which only had a few, all very positive reviews. I was thinking something along the lines of “well, it works, but it’s probably not as reliable as you’d expect”. And I wasn’t able to: the site where I bought it (a local site named LDLC) said, in its error message, that only customers who bought the product are able to review it. They still have the bill corresponding to my purchase, which I was able to download in my customer area, but the review error message tells me I didn’t buy it.
I assume their error message is just poorly worded, and that they actually mean there is a time limit between the moment you make a purchase and the moment you can no longer review it. But then it means, you guessed it… there is a bias in the reviews: reviews can only be posted during a limited time after purchase, meaning all potentially negative but important feedback about lifespan or reliability issues will be underrepresented.

So I thought okay, never mind reviewing the product, but I could still review the site that prevents me from reviewing an old purchase, right? Here comes Trustpilot, where I had already left a review about LDLC a long time ago (possibly after they suggested it, I’m not sure, but I’m really not a big user of such review sites). So I left a review on Trustpilot, explaining that LDLC didn’t let me review a product I bought long ago and which turned out to not last long. All done, or so I thought.

The following day, I received an e-mail from Trustpilot, with a rather cold and standardized note from LDLC saying (I tried to translate as faithfully as I could from French here): “For this review to be considered, an order number allowing to justify a consumer experience is required” (“Pour la prise en compte de cet avis, un numĂ©ro de commande permettant de justifier d’une expĂ©rience de consommation est requis”). It was followed by a standard message from Trustpilot, much more friendly, saying (again, translated from French, but it was easier to translate because unlike the other part it doesn’t sound robot-like): “Do you wish to send the information they asked? No pressure, you decide what you want to share.” I thought no thanks, I’m done talking to them, so I just let it slide. I got an automated follow-up 3 days later, which I let slide too.
8 days later, I received another e-mail from Trustpilot, much different this time, saying LDLC reported my review because they “don’t think I had an authentic buying or service experience in the last 12 months”. As a result, the review was unpublished, pending additional information. I guess those LDLC assholes decided to go for it and gamble that I wouldn’t react, in order to try to remove an embarrassing, truthful review (you don’t need to take my word for it, you can just check for yourself by trying to rate a product you ordered more than 4 years ago). Tough luck, I sent the required info… and made this post too, because I felt I had ended up with enough material to talk a bit about bias in reviews.

Not sure what to call this bias… “time bias” maybe? To sum it up:
– LDLC doesn’t allow reviews on old purchases (I could find no information about the time window during which you’re allowed to post a review after a purchase; all I know is that reviewing my 4-year-old purchase was impossible)
– Trustpilot also doesn’t allow reviewing old experiences (“an authentic buying or service experience in the last 12 months”, they say). It is unclear how this applies to my case: I want to review an old purchase today and I can’t. My experience of not being able to review happened less than 12 months ago, but the purchase that made me want to write the initial review is older. They asked me to provide proof of a purchase (not just use of the service) less than 12 months old. So most likely, if I hadn’t purchased something else in that timeframe, LDLC would have been able to remove the review, even though it relates to a (dis)service that happened less than a month ago.

All in all, when reading reviews, be aware of the “time bias” that the reviewing system may introduce. For instance, if a company screws someone over a purchase they made 12 months and 1 day ago, you won’t hear about it in a Trustpilot review (except maybe if that person had a more recent purchase at the same company). If LDLC sells products that last 4 years but not 5, you won’t hear about it in an LDLC review (except if the person buys the same product again just to give it a zero?).
Surely, other sites have similar time constraints. And sadly, it may be hard to be aware of them. In this case, neither LDLC nor Trustpilot prominently states that consumers are unable to leave reviews after a certain period of time. I can understand that choice in Trustpilot’s case, as verifying old proofs can be tedious. But as for LDLC, it’s just bloody convenient to avoid product ratings being affected by poor mid-/long-term reliability. And in both cases, I see no valid reason why the pages presenting the reviews don’t clearly state something like: “Beware, customers can only post reviews within X months/years after their purchase”.

Avoiding biases is hard, sometimes even impossible. I would even say that you can never get rid of all biases in a study. But the right attitude is to disclose them and discuss them, not to conceal them and hope no one will notice.

Posted in reviews.


How to delete a commit that was already pushed to Gitlab

For the sake of simplicity, I’ll consider the case where you want to delete your latest commit. The command for this, locally, is git reset --soft HEAD~. If you want to delete older commits, you can use something like git rebase -i [commit ID right before the one you want to delete] (see more details there).
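
As a quick sketch of that rebase variant (the commit IDs here are made up): in the todo list that opens, change pick to drop (or simply delete the line) for the commit you want to remove, then save and quit.

git rebase -i abc1234
# in the editor that opens:
#   pick def5678 some commit to keep
#   drop 9abcdef the commit to delete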

The problem I’ll focus on here is how to push your changes, now that you’ve removed something that was already pushed (and possibly added another commit on top). The command is trivial, it’s git push -f origin master (replace origin and master with the remote and branch names). But you may face a bunch of errors when trying to run it.

The first error I got was:

git@gitlab.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

I’m on Windows, and apparently Git is unable to use my SSH key, despite it being loaded in pageant (and, as a matter of fact, usable when I push with SourceTree). The solution to this is to add the path to plink.exe in the GIT_SSH environment variable. You can set it permanently via the usual UI (or via Rapid Environment Editor), or you can set it temporarily, like I did, via the command line using set GIT_SSH=C:\PATH\TO\plink.exe. NB: don’t use quotes, or you’ll get something like error: cannot spawn "C:\PATH\TO\plink.exe": Invalid argument - error: cannot spawn "C:\PATH\TO\plink.exe": Permission denied. If you happen to have spaces in this path, maybe escaping them with backslash-space (“\ ”) will work. Or just use the GUI. Or move plink to a folder without spaces.
If you’re on another OS, you might find a more appropriate fix for you in this thread.

The second error I got was remote: GitLab: You are not allowed to force push code to a protected branch on this project.
I had never protected a branch, but I quickly realized that Gitlab protects the master branch by default. You can unprotect it in Settings → Repository → Protected Branches (and once you’re done, maybe you’ll want to protect it back) (source).

And that’s all the trouble I had; git push -f origin master (or just git push -f if you only have one remote and one branch) should work now.

Posted in programming.


How to export a whole DynamoDB table (to S3)

For full details, you’ll want to read the documentation, which provides a whole section on this here. I’ll try to be a lot more concise while still including enough relevant details.

In your AWS console, go to https://console.aws.amazon.com/datapipeline/. Create a new pipeline. Set a meaningful name you like. In “Source”, select “Build using a template” and pick the template “Export DynamoDB table to S3”.

The “Parameters” section should be obvious: indicate which table to read, and which S3 bucket and subfolder you want the backup to be saved in. “DynamoDB read throughput ratio” is an interesting parameter: it allows you to configure the percentage of the provisioned capacity that the export job will consume. The default is 0.25 (25%), you may want to increase it a bit to speed up the export.

The “Schedule” section is useful if you want to run an export regularly, but if you don’t, pick “Run on pipeline activation”.

In “Pipeline Configuration”, I chose to disable logging (note that the pipeline itself is “free”, but you’ll still have to pay any incurred costs, like the EC2 instance that will run the job and the S3 storage used by your backup, logs, etc).

In “Security/Access”, I just left IAM roles to “Default”. Not sure what the use cases are for this section.

You can add tags if you like, and click on “Edit in Architect” if you want to customize it more, but I’ll just click “Activate” here. It may tell you “Pipeline has activation warnings” (notably, there’s a warning if you disable logs), you can pick “Edit in Architect” to review the warnings, or just pick “Activate” again anyway.
If you do so, you’ll be redirected to your pipeline’s page, and it will start running shortly after (you may have to refresh the page, or go back to your pipelines list and again to the new pipeline’s page). The “WAITING_FOR_RUNNER” status will probably last almost 10 minutes before the job is actually “RUNNING”.
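
For reference, if you prefer the command line, the same kind of pipeline can also be created and activated with the AWS CLI. This is just a sketch, assuming you already have the pipeline definition as a JSON file (exported from the console, for instance); the names and IDs below are made up:

aws datapipeline create-pipeline --name dynamodb-export --unique-id dynamodb-export-1
aws datapipeline put-pipeline-definition --pipeline-id df-EXAMPLE --pipeline-definition file://export-definition.json
aws datapipeline activate-pipeline --pipeline-id df-EXAMPLE
aws datapipeline list-runs --pipeline-id df-EXAMPLE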

AWS Data Pipeline, job waiting for runner

Posted in web development.



Getting Collabora Online to work in Nextcloud

Collabora Online is basically an open source Google Docs replacement with a very ugly UI and questionable performance. But it Just Worksℱ, and at least it doesn’t spy on you.
I helped set up a Nextcloud instance, and people there wanted Collabora Online in it. It was tougher than expected, and none of the instructions I found were exhaustive (although these ones are pretty complete), so here’s a recap.

Prerequisites:

  • A Linux server
  • Nextcloud up and running
  • Apache and some knowledge about configuring it (or knowing how to replicate what I’ll describe on your HTTP server of choice)
  • Let’s Encrypt (certbot) or knowing how to obtain a TLS certificate otherwise

First, use Docker. It’s theoretically possible to install Collabora the classic way with your package manager, but I just didn’t manage to get it to work this way.
apt-get install docker.io
Then
docker pull collabora/code
We’ll start it later. For now, you need to configure a dedicated subdomain, ideally with HTTPS.

In your Apache configuration, make sure the following modules are enabled: proxy, proxy_wstunnel, proxy_http, and ssl
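On Debian/Ubuntu with the stock Apache packages, enabling them should look something like this:
a2enmod proxy proxy_wstunnel proxy_http ssl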
Then add an HTTP virtual host (it will be used to validate your TLS certificate with Let’s Encrypt) as follows (of course, adapt it to your domain and paths):

<VirtualHost *:80>
   ServerName nextcloud.example.com
   ServerAlias collabora.example.com
   DocumentRoot "/home/example/www"
   # RewriteEngine On
   # RewriteCond %{HTTPS} off
   # RewriteRule (.*) https://%{SERVER_NAME}/$1 [R,L] 
   <Directory "/home/example/www">
   Require all granted
   Options -Indexes
   AllowOverride All
   </Directory>
</VirtualHost>

and restart (or reload) Apache: /etc/init.d/apache2 restart

Note that I set up the HTTP virtual host to accept 2 subdomains at the same time in order to use it to validate a certificate for both Nextcloud and Collabora at once.
To obtain your certificate (via Let’s Encrypt, assuming it’s already installed):

certbot certonly --webroot -w /home/example/www/ -d nextcloud.example.com -d collabora.example.com

You can now add the proxy virtual host (again, adapt it to your domain and paths):

<VirtualHost collabora.example.com:443>
  ServerName collabora.example.com:443

  # SSL configuration, you may want to take the easy route instead and use Lets Encrypt!
  SSLEngine on
  SSLCertificateFile /etc/letsencrypt/live/nextcloud.example.com/cert.pem
  SSLCertificateKeyFile /etc/letsencrypt/live/nextcloud.example.com/privkey.pem
  SSLCertificateChainFile /etc/letsencrypt/live/nextcloud.example.com/fullchain.pem
  SSLProtocol             all -SSLv2 -SSLv3
  SSLCipherSuite ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS
  SSLHonorCipherOrder     on

  # Encoded slashes need to be allowed
  AllowEncodedSlashes NoDecode

  # Container uses a unique non-signed certificate
  SSLProxyEngine On
  SSLProxyVerify None
  SSLProxyCheckPeerCN Off
  SSLProxyCheckPeerName Off

  # keep the host
  ProxyPreserveHost On

  # static html, js, images, etc. served from loolwsd
  # loleaflet is the client part of LibreOffice Online
  ProxyPass           /loleaflet https://127.0.0.1:9980/loleaflet retry=0
  ProxyPassReverse    /loleaflet https://127.0.0.1:9980/loleaflet

  # WOPI discovery URL
  ProxyPass           /hosting/discovery https://127.0.0.1:9980/hosting/discovery retry=0
  ProxyPassReverse    /hosting/discovery https://127.0.0.1:9980/hosting/discovery

  # Main websocket
  ProxyPassMatch "/lool/(.*)/ws$" wss://127.0.0.1:9980/lool/$1/ws nocanon

  # Admin Console websocket
  ProxyPass   /lool/adminws wss://127.0.0.1:9980/lool/adminws

  # Download as, Fullscreen presentation and Image upload operations
  ProxyPass           /lool https://127.0.0.1:9980/lool
  ProxyPassReverse    /lool https://127.0.0.1:9980/lool
</VirtualHost>

And restart Apache again

Now, you should be good to start up the Collabora Docker container:
docker run -t -d -p 127.0.0.1:9980:9980 -e "domain=nextcloud\\.example\\.com" -e "server_name=collabora\\.example\\.com" --restart always --cap-add MKNOD collabora/code
Note that you need to indicate the Nextcloud domain in “domain”, not the Collabora one (the Collabora one can, optionally, be indicated in “server_name”: apparently it may help websockets work better). If you don’t indicate the proper domain in “domain”, you’ll get an error saying “Unauthorized WOPI host” somewhere in your Nextcloud logs (FYI, they are in nextcloud/data/nextcloud.log).
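To keep an eye on that log while testing (adapt the path to wherever your Nextcloud data directory lives):
tail -f /path/to/nextcloud/data/nextcloud.log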

You can now install the Collabora Online plugin in Nextcloud.
Then, in Settings → Administration → Collabora Online, set the Collabora Online server to https://collabora.example.com

I’ve often had trouble getting Collabora Online to work right away this way: if it still doesn’t work in Nextcloud after a few minutes, I’ve found that restarting the Docker container usually helps:
docker ps (to list containers and get the ID)
docker restart [container ID]

Posted in LibreOffice & OpenOffice, servers, software.


How DynamoDB counts read and write capacity units

I happen to use AWS DynamoDB at work (ikr), and one of the things that are way harder to grasp than they should be is the way it counts consumed read and write capacity. It is however pretty simple, once you manage to find the right pages (with an s) of the documentation. I’ll try to summarize it here:

Read capacity

A read capacity unit (RCU?) allows you one strongly consistent read per second, if your read is up to 4KB in size. If your read is larger than 4KB, it will consume more (always rounded up to the nearest 4KB multiple). If you use eventually consistent reads, it counts half. The default reading mode is the eventually consistent one.
If you get an item (< 4KB), it counts as one read (or half a read if using eventually consistent reads). If you get X items (each < 4KB), it counts as 1 read per item, no matter whether you do X Gets or 1 BatchGet (so I’m not sure how useful BatchGet is, compared to the code complexity it adds).
If you query items, only the total size matters.

If you “just” count items (e.g. a query with Count: true and Select: 'COUNT'), you will still consume as much capacity as if you had returned all the items.
Note that if your result set is larger than 1MB, it will be cut at 1MB. To read more than 1MB of data, you’ll have to perform multiple queries, with pagination.

Practical examples:
– Get a 6.5KB item + get a 1KB item = 3 reads (if strongly consistent) or 1.5 reads (if eventually consistent)
– Query 54 items for a total of 39KB = 10 reads (if strongly consistent) or 5 reads (if eventually consistent)
– Count 748 items that have a total size of 1.1MB = 250 reads (if strongly consistent) or 125 reads (if eventually consistent) for the first 1MB + another count query for the remaining 100KB.

Write capacity

A write capacity unit (WCU?) allows you one write per second, if your write is up to 1KB in size (yup, that’s not the same size as for the reads… how not confusing!). Multiple items or items larger than 1KB work just as for reads. Also, I don’t remember where I read it, but I’m pretty sure delete operations count like writes, and update operations count like writes with the size of the larger version of the modified item as the reference.

Practical examples:
– Write a 1.5 KB item + write a 200 bytes item = 3 writes
– Delete a 2.9KB item = 3 writes
– Update a 1.7KB item with a new version that’s 2.1KB = 3 writes
– Update a 1.1KB item with a new version that’s 0.7KB = 2 writes

On a side note, I’m not really sure if DynamoDB uses 1KB = 1000 bytes or 1KB = 1024 bytes.

Burst capacity

At the moment (apparently it may change in the future), DynamoDB retains up to 300 seconds of unused read and write capacity. So, for instance, with a provision of 2 RCU, if you do nothing for 5 minutes, then you can perform 1200 Get operations at once (for items < 4KB and using eventually consistent reads).

Sources and more details

I tried to focus on the most important points about read and write capacity units. You can find more details about this topic in particular and, of course, about DynamoDB in general, in the docs. Notably, I used these pages a lot here:
AWS DynamoDB Documentation – Throughput Settings for Reads and Writes
AWS DynamoDB Documentation – Best Practices for Designing and Using Partition Keys Effectively
AWS DynamoDB Documentation – Working with Queries

Posted in web development.


Fixing letsencrypt’s “expected xxx.pem to be a symlink”

Apparently, last time I migrated my server, I messed up my Let’s Encrypt configuration. Or maybe Let’s Encrypt changed its way of storing it. Anyway, renewing my certificates failed with this error:

expected /etc/letsencrypt/live/notepad.patheticcockroach.com/cert.pem to be a symlink
Renewal configuration file /etc/letsencrypt/renewal/notepad.patheticcockroach.com.conf is broken. Skipping.

Obviously, a file was supposed to be a symlink and it wasn’t. Which is strange, because I migrated just like the previous times, and a migration never caused that issue before. Anyway, I found a suggested solution that said to turn said .pem file into a symlink manually. Sounds a bit hackish to me.

I chose to just reissue new certificates for the same domain name. But if you do so, you must clean up properly, otherwise you’ll end up with new paths to your certificates, something like /etc/letsencrypt/live/yourdomain.com-0001/cert.pem, which would require you to also update your HTTP server configuration.

To clean up:

rm -rf /etc/letsencrypt/{live,renewal,archive}/{yourdomain.com,yourdomain.com.conf}

(source)
(NB: watch out, you should probably make a backup before running this)
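
A simple copy of the whole Let’s Encrypt directory should be enough as a backup (the destination path is just an example):

cp -a /etc/letsencrypt /etc/letsencrypt.bak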

Then you should be able to get a new certificate, under the same file and folder names, with the usual command:

certbot certonly --webroot -w /home/www/path -d yourdomain.com

Posted in security, servers, web development.