TL;DR 1) put your dearth of expertise on view earlier rather than later
2) take care using a sledgehammer when opening soft-shell almonds
For 25 years now VSM has very generously been home to WASD as well as a
development test-bed. Many WASD apps were initially deployed on its systems
and from there have gone out to conquer a very small corner of a very small
niche O/S -- including the wuCME app.
wuCME is a WASD server Certificate Management Environment for the Let's
Encrypt (LE) service.
https://wasd.vsm.com.au/wasd_root/src/wucme/readmore.html
ACME, the enabling protocol, employs cryptographically signed data exchanges
to establish identity and provide server certificates to sites automatically.
One method to establish ownership of a domain name is a challenge-response
HTTP request to that host.
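
As an illustration of that challenge-response, here is a minimal sketch of what the server is expected to hand back, following RFC 8555 (ACME) and RFC 7638 (JWK thumbprint). This is not wuCME's code; the function names are hypothetical and the account key is assumed to be already available as a JWK dictionary. The response body is the challenge token, a dot, and the thumbprint of the account key. With a 43-character token that comes to 87 characters, which would be consistent with the 87-byte body in the server log further down.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, the encoding ACME uses throughout."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """RFC 7638 thumbprint: SHA-256 over the required members only,
    serialised with sorted keys and no whitespace."""
    required = {"RSA": ("e", "kty", "n"), "EC": ("crv", "kty", "x", "y")}[jwk["kty"]]
    canonical = json.dumps({name: jwk[name] for name in required},
                           sort_keys=True, separators=(",", ":"))
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, account_jwk: dict) -> str:
    """Body the server must return for an HTTP challenge."""
    return token + "." + jwk_thumbprint(account_jwk)


def challenge_path(token: str) -> str:
    """Path the validating servers request on port 80."""
    return "/.well-known/acme-challenge/" + token
```

Every validating server fetches the same path and expects the same body, so a correct answer to one request says nothing about whether the others ever arrived.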
wuCME -- and this is significant -- is based on a GPL-licensed code base,
written in plain C, and "wrapped" with some more C code for use with WASD.
So I didn't write the algorithms employed nor really come to grips with the
internals. Thank you Nicola Di Lieto. It just worked. Until recently.
wuCME began failing to renew the LE certificate of a long-time VSM site.
**[edited for security and clarity]
|2024-03-14 02:44:15.54: version WUCME IA64-1.1.8 (1.1.2) (OpenSSL 3.0.7 9 Nov 2022) starting
|2024-03-14 02:44:15.54: loading key from /wasd_root/local/wucme_k_account.pem
|2024-03-14 02:44:15.59: loading key from /wasd_root/local/wucme_c_mail_vsm_com_au.pem
8< snip 8<
|2024-03-14 02:44:30.83: the server reported the following error:
|{
| "type": "urn:ietf:params:acme:error:connection",
| "detail": "During secondary validation: 119.252.17.13: Fetching \
http://mail.vsm.com.au/.well-known/acme-challenge/xEAHIHLix_: Error getting validation data",
| "status": 400
|}
|2024-03-14 02:44:30.83: failed to authorize order at \
https://acme-v02.api.letsencrypt.org/acme/order/17298823/252017381857
Wadya mean, "Error getting validation data"?
But ... but ... we see the challenge request and response on the server!!
||02:45:09.34 NET 2202 000015 CONNECT ACCEPTED outbound1b.letsencrypt.org,44517 on \
http://www.vsm.com.au,80 (119.252.17.13) BG44236:|
8< snip 8<
||02:45:09.34 SERVICE 1779 000002 CONNECT VIRTUAL mail.vsm.com.au:80|
||02:45:09.34 REQUEST 4442 000002 REQUEST GET /.well-known/acme-challenge/l6Zlh8< snip 8<|
8< snip 8<
||02:45:09.37 NET 2696 000002 RES-HEADER HEADER 275 bytes|
|HTTP/1.1 200 OK
|Server: HTTPd-WASD/12.2.0 OpenVMS/IA64 SSL
|Date: Thu, 14 Mar 2024 15:39:09 GMT
|Accept-Ranges: bytes
|Accept-Encoding: gzip, deflate
|Content-Length: 87
|Connection: close
|Cache-Control: no-cache, no-store, must-revalidate
|Pragma: no-cache
|Expires: 0
|
||02:45:09.37 NET 2839 000002 RES-BODY BODY 87 bytes|
|6C365A2D 6C68446A 65503830 6950655A 5F694B4E 56623773 64364C32 3457544A l6ZlhDjeP80iPeZ_iKNVb7sd6L24WTJ
8< snip 8<
|4A6D5A55 73744737 66666A42 76684E6D 645F472D 4C7338 JmZUstG7ffjBvhNmd_G-Ls8
Lots of Internet searching all pointed to DNS/firewall/related issues
blocking the challenge. But ... but ... we're seeing it!
Well, I eventually registered for the https://community.letsencrypt.org/ forum
and posted the question there. In a very short time I was asked,
"How many requests are you seeing? There should be at least three, and
they're working on adding even more."
(Hmmmm, three!?) "Just the one."
"Then you (or something upstream of you) has something blocking some of the
requests that Let's Encrypt is using to try to validate that you own the
name."
Seeing one, instead of three. And shortly five...
https://community.letsencrypt.org/t/lets-encrypt-is-adding-two-new-remote-perspectives-for-domain-validation/214123/2
LE very cleverly challenges a site from multiple geographically diverse
servers, making it more difficult for attackers to hijack validation requests.
Well what can I now say? MY FAULT.
Two weeks before, soyMAIL, and the site in general, seemed unusually sluggish
and, looking at the server stats, there were many tens of connections all
churning away as hard as they could. A ham-fisted DoS? Trawlers loose from
the moorings? Difficult to say. There was nothing distinguishing them. And
the user-agent did not identify as a bot (many of which I've excluded using
explicit mapping rules over the years).
So, not being identifiable, I took out my trusty sledgehammer (still shiny
new) and clobbered them. Instantly more responsive. No surprise there!
The rule:
#WASD_CONFIG_REJECT
*.amazonaws.*
and didn't think about it again -- until eating soup at a local cafe
(unnecessary detail but does reinforce the idea of non-conscious problem
solving).
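
To see how wide that sledgehammer swings, here is a small sketch that applies the same pattern to client host names. It assumes WASD's wildcard behaves like a shell-style glob, where "*" matches any run of characters including dots; that is an assumption about the matching semantics, not a statement of how the server implements it. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

# The single rule from the reject list quoted above.
REJECT_PATTERNS = ["*.amazonaws.*"]


def is_rejected(client_host: str, patterns=REJECT_PATTERNS) -> bool:
    """True if the (case-folded) host name matches any reject pattern.
    Glob semantics are assumed here, not taken from WASD itself."""
    host = client_host.lower()
    return any(fnmatchcase(host, pattern) for pattern in patterns)


if __name__ == "__main__":
    for host in ("outbound1b.letsencrypt.org",
                 "ec2-18-216-65-86.us-east-2.compute.amazonaws.com"):
        print(host, "rejected" if is_rejected(host) else "accepted")
```

Under that assumption anything resolving to an EC2 host name is turned away, whether it is an unidentified crawler or a validation request that happens to originate from AWS.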
After commenting out that rule and issuing HTTPD/DO=REJECT:
|13:18:40.27 NET 2202 000015 CONNECT ACCEPTED outbound1b.letsencrypt.org,43751 on \
http://www.vsm.com.au,80 (119.252.17.13) BG44236:|
8< snip 8<
|13:18:40.87 NET 2202 000016 CONNECT ACCEPTED ec2-18-216-65-86.us-east-2.compute.amazonaws.com,41306 on \
http://www.vsm.com.au,80 (119.252.17.13) BG44246:|
8< snip 8<
|13:18:41.58 NET 2202 000018 CONNECT ACCEPTED ec2-18-237-231-133.us-west-2.compute.amazonaws.com,34448 on \
http://www.vsm.com.au,80 (119.252.17.13) BG44249:|
Three validations, from three unique servers, and a happily renewed
certificate.
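
The "how many requests are you seeing?" check is easy to automate. The following is a rough sketch that pulls the distinct client hosts out of log lines shaped like the excerpts above. The regular expression is inferred from those excerpts only, it assumes each entry has been joined onto a single line, and it counts every accepted connection rather than only those that went on to request the challenge path. The function name is hypothetical.

```python
import re

# "CONNECT ACCEPTED <host>,<port>" as it appears in the excerpts above.
ACCEPT_RE = re.compile(r"CONNECT ACCEPTED (?P<host>[^,\s]+),(?P<port>\d+)")


def validation_sources(log_lines):
    """Return the distinct client hosts, in order of first appearance."""
    hosts = []
    for line in log_lines:
        match = ACCEPT_RE.search(line)
        if match and match.group("host") not in hosts:
            hosts.append(match.group("host"))
    return hosts


if __name__ == "__main__":
    sample = [
        "|13:18:40.27 NET 2202 000015 CONNECT ACCEPTED "
        "outbound1b.letsencrypt.org,43751 on http://www.vsm.com.au,80|",
        "|13:18:40.87 NET 2202 000016 CONNECT ACCEPTED "
        "ec2-18-216-65-86.us-east-2.compute.amazonaws.com,41306 on http://www.vsm.com.au,80|",
    ]
    sources = validation_sources(sample)
    print(len(sources), "distinct source(s):", sources)
```

Seeing fewer sources than LE is known to use is the hint that something between LE and the server is dropping the rest.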
Would have saved a couple of weeks of head-scratching, along with trial-and-error
solutions, had I sought the same assistance earlier.
This item is one of a collection at
https://wasd.vsm.com.au/other/#occasional