Here’s a bug in Windows Explorer that I happened to notice today. It has zero value and, despite my best efforts to find a way to abuse it, is seemingly completely useless. Tested on Windows 10 22H2 and Server 2019/2022.
If you right click on the start menu icon in the task bar, the task bar across all screens freezes in time. The clock doesn’t update, applications that open after you right click don’t show up on the task bar as being open (except in Server 2022), but when you click away from the right-click menu everything snaps back to the correct state.
Right clicking on other things in the task bar doesn’t yield the same results. It’s only the start menu icon.
Sims.Net is a very common and, in my opinion, a very badly written Management Information System used in the UK’s lower education sector – primary and secondary schools. It’s commonly used not because it’s good, but because it was one of the few options many years ago and a huge number of schools just haven’t bothered to move to something else. And I don’t blame them, it’s got a lot of history and data going back decades in that database. It’s not an easy thing to migrate off of.
ESS, the (current) caretakers of Sims.Net, have recently changed their licensing model to seemingly prevent third parties from hosting Sims.Net data on behalf of a school. I admit, I don’t know the details, but that’s the message as I understand it.
It looks like, due to this and the EOL of Server 2012 R2, many school techs and sysadmins across the UK will be scrambling to get Sims.Net installed on an on-prem server.
Unfortunately, documentation is sparse. It’s a shame, but it costs a whole lot of cash to pay for third parties to do this for you. Right?
So if you’re looking to install or migrate Sims.Net, hop in and come for a ride. It’s easier than you think! Until you get to SOLUS3. We won’t be migrating that. But I’ll go through how to install it fresh!
I really appreciate well-written implementation guides for server-client software, but what really gets me excited is seeing migration guides for when you need to decommission that legacy OS and move onto something supported and current. Even the bad migration guides are great to have, but some that I’ve come across are works of art – numbered-list instructions that are clear and concise, incorrect assumptions busy sysadmins have during the migration are highlighted then corrected, screenshots where required, mid-migration tests to ensure things are going as smoothly as they seem… ahh, dreamy.
One of my personal pet peeves, however, is purchased server-based software that doesn’t come with any migration guidance or instructions.
I’m spending a lot of time at the moment moving niche (read: there’s only a couple of options and they all suck) and poorly made software from older server OSes to newer ones, and whilst some of them are fine and can be figured out right away, most of them have… issues. Typically you can just install $dumpsterFire fresh to the new server, dump the existing database and throw it over the fence to the new server (yes, some of this terrible software rEqUiReS tHe DaTaBaSe Is oN tHe SaMe SeRvEr), then maybe update some config files and point the new install at the existing database. One CNAME change later for the clients and you’re golden. Simple stuff.
Sometimes, however, you need to perform some forbidden magical incantations, and the kick to the teeth in many cases is that nobody in $org will write down what they are publicly. So you gotta pick the phone up and get some overworked and underpaid $tech at $org to walk you through the process, shooting down the constant stream of bugs and errors that occur along the way due to the shoddy quality of $dumpsterFire with bullets of solidified experience (that’ll no doubt be lost when $tech has had enough and leaves for greener pastures.)
Or, even worse than that… the $org requires payment for a migration because migrations aren’t considered “support” and are “optional”. Yes, it is support. And no, it isn’t optional.
Dear organisations that explicitly hide their migration guides to force an already-paying customer to pay you yet again to migrate your horrible software from one server to another (immoral), that INSIST that you “must TeamViewer in to do this” (in business hours only!?), that there’s no possible way anyone else can do it (false and stupid), that require DA/root accounts (you don’t), that have to be installed on a Domain Controller (I’m crying), that MUST have unmitigated 24/7 access by installing unlicensed TeamViewer (Oh no you won’t)… There’s even one software solution here which “requires” a physical server. In 2023. The software won’t work in a virtual machine. (Oh, wait, spoiler: it works fine in a VM and has been working fine for over a decade.)
To those organisations I say: fuck you. Do better.
Because I’m writing this all down, I’m recording my screen when you connect to solve unhelpful errors in your hidden log files, I’m fixing your stupid permission requirements and immediately uninstalling any additional or third party crap that isn’t a business requirement.
Back when I did end user support as my primary job I’d often ask myself “if there were no helpdesk tickets, what would I work on right now?”
The list I’d come up with was generally fairly short and changed each time I found myself pondering it, typically because the needs of the environment changed quite frequently, but it would also contain a few regulars – patch this, upgrade that, move that thing from there to there. Stuff that wouldn’t have an immediately beneficial impact but should probably happen. These are all good things to do and I encourage them to be part of the daily workload of a team.
The dynamic entries in the list were always something to do with making something better, getting rid of an annoyance or frustration. These are the things that do actually have immediate ROI and I would argue they should have time spent on them even during high ticket load periods but often find themselves being ignored because… Well, it ain’t broken. It’s just not great.
The issue is, as hinted at, that this list changes often. Sure, you can keep notes, maintain an ideas board or submit a ticket, but when you do finally get an hour to look at the list, what’s on it doesn’t seem that critical.
I just read an older post on rachelbythebay.com that summarises this and explains what it means in five words:
Read it (and whilst you’re there, if you don’t already consume that content, take a look at some other posts – very valuable information and opinions contained within).
Essentially, find the things people have whipped up a quick workaround for and are using and fighting with now, whether that’s within your team or another. Spend some time making that thing better, or resolve the problem at source if possible. You’ll not only make that person or team happy, you’ll also actively solve a now problem that will probably directly impact success, however that’s measured.
I recently read a blog post about a blog post and came to realise that I too struggle to post stuff on this very blog. I am conscious that I suffer from the curse of “I must make everything interesting!” but… maybe I don’t have to suffer any more. Maybe I don’t have to care. After all, if you don’t like it, you don’t have to read it.
I started this corner of the web to document my renovation and tech stuff for me and myself only. And Ben. (Hi Ben) But somewhere along the way it started to be about what other people find interesting, and… nothing really felt big or important or interesting enough. Hence the lack of posts recently.
So, a reset. Time to post more. They may not fit the predetermined categories, they may be completely uninteresting, but it turns out that not only is that okay, it’s what I intended originally anyway. They’re interesting to me, and besides I suspect I read these posts more than anyone else. Maybe, just maybe, someone else will find something a little bit interesting or useful too? That’ll be a positive bonus.
The algorithms and view counts hold no sway. I forbid thee!
The stuff below isn’t new. In fact, in the linked article is a link to a reddit thread where someone outlines this exact problem. But I feel that it can’t hurt to reiterate. And explore!
So, for the obligatory warning: Don’t paste anything on this page into a PowerShell window. Don’t paste it into anything but a text editor. The examples below shouldn’t be harmful but… look, just don’t risk it, okay?
Malicious String – copy and paste the below example into a text editor (NOT a PowerShell window)
I’ve said previously that things come in threes. This particular month in 2020 gave us three quite severe issues in the server room at work.
One – as the tide comes rolling in
We had some heavy rain in early October over a weekend. I spent the majority of that time in front of the fireplace until work on Monday morning.
Before I even got through the office I was alerted to a flood in our server room. I rushed down there to find an absolute mess. We are unfortunately strapped for space and also use the server room to store our spare equipment and some peripherals and consumables. We lost quite a lot of stuff to water damage, though most of it was old and didn’t hold much value. There were some switches that got soaked, and also my own personal Draytek that I was going to use as a secondary VPN (ours wasn’t great at the time) during lockdown in case of emergencies.
You can see in the following image a “tide line” around the walls of the room – we had about an inch of water in there that slowly soaked away. A phenomenal amount of water given the size of the room and the fact that there are sizeable gaps under the two doors that allow water to escape to neighbouring rooms.
Somehow our servers didn’t get too wet. Humidity was at 90%+ for quite a while though, and we have experienced several hard drive failures since which were likely related.
After three days of drying, ripping up carpet and sorting through our stuff we could finally get back to the normal slog. However we learned that the fitted Aircon unit in the room doesn’t have a cut off for drying the air. It’s either on or off, and only when manually set. We can’t set a target humidity. In a server room you typically want to hover at about 50% – too high and the moisture in the air corrodes internals, too low and static electricity can more easily discharge in the dry air. Before ripping out the carpet and other detritus we’d hover at about 50-70% without any flood water in the room, but since then we’ve gone as low as 30%.
We need proper climate control and monitoring. And for the leak to be fixed… It has been an issue for a while which involves the design of the roof – water does drain away but if there’s too much too quickly, or the drainage is blocked as was the case this time, the water overflows and somehow finds its way to the ground through the second path of least resistance… Which just so happens to pass right through the server room about two feet in front of the cabinets.
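On the monitoring side, the alerting logic itself doesn’t need to be clever. Here’s a minimal Python sketch, assuming a 40 to 60% acceptable band around that 50% target. The band edges are my assumption rather than a standard I’m quoting, and `HumidityBand` and `classify_humidity` are hypothetical names. Reading from an actual sensor is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HumidityBand:
    """Acceptable relative-humidity range in percent (assumed values)."""
    low: float = 40.0   # assumption: below this, static discharge is the worry
    high: float = 60.0  # assumption: above this, corrosion is the worry


def classify_humidity(percent: float, band: HumidityBand = HumidityBand()) -> str:
    """Label a relative-humidity reading as 'too dry', 'ok' or 'too humid'."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError(f"relative humidity must be 0-100, got {percent}")
    if percent < band.low:
        return "too dry"
    if percent > band.high:
        return "too humid"
    return "ok"


if __name__ == "__main__":
    # Readings mentioned in this post: post-flood, pre-flood high, target, recent low.
    for reading in (90, 70, 50, 30):
        print(f"{reading}% -> {classify_humidity(reading)}")
```

Anything that comes back as something other than "ok" would be the trigger for an alert, or for someone to go and flip the aircon by hand.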
Two – splashback
About two weeks after the first incident we had more heavy rain. More water in the server room. Luckily not as much water and because the room was empty and carpetless it dried out within a day. Unfortunately it came in through a slightly different part of the ceiling about half a foot closer to the server rack.
I’d love to get up on to the roof to try and figure out where this water is getting in. I’m told the fix is “new roof” but I wonder whether it’s feasible to make a path for the water to escape that doesn’t pass through one of the most expensive rooms on the campus. Not a fix, certainly, but a workaround until we can move the servers (which we should be moving before the end of this year).
Three – this is nuts
Finally (at least I hope) is this, which happened about a week after flood #2:
We saw him on the camera we set up after the first flood. We raced down there and eventually managed to coax him out from his temporary home directly under the server rack before he could eat through anything. More squirrels have since found their way into the building but none have yet braved the server room. For fear of drowning, I suspect.
Although management doesn’t particularly care about incident management and response reports, I find the whole process fascinating. So I wrote up an incident report for the squirrel invasion and it was quite entertaining to type out to say the least. So many puns.
TL;DR: Try disabling “SNMP Status” for the port on the Windows print server in Print Management. No guarantee this will work everywhere but it appears to have worked for me. Give it a go, let me know!
Printer says… what, exactly?
Feel free to skip the rest of this post. It’s just the sequence of events that took me to the supposed conclusion/solution.
I arrived at a customer site to find a stack of issues, but one in particular caused some confusion to my smooth-brained self. An HP OfficeJet Pro 8710 printer, which worked up until last Wednesday, had a black screen with an error on it stating:
There is a problem with the printer. Turn the printer off, then on.
I of course tried a reboot. The printer switched on, booted up and appeared to be fine. But after 5 to 10 seconds it would make a clicking noise, the screen would flash blue, then go black with the same error. I hate printers.
It looked like there was some white text on the blue screen so I recorded it with the phone in slow motion and picked up an error code: b8bb2b3e
Googling this took me to one relevant result, a post on the HP forum with no replies. Great. I hate printers (I may say this several times) and I hate an error code with no official documentation anywhere on the internet. So, what is one to do in such a situation?
Well, poke about, push buttons, and hope you figure out a pattern, obviously!
My immediate thought was something PrintNightmare related, however as the post on the HP forum was made in April I moved that away from the forefront of my mind for now and started looking elsewhere. As luck would have it, I noticed that the error occurred just as the Wireless light finished flashing (because yes, this printer is connected to a domain network wirelessly. Literally the definition of evil.)
I plugged in a network cable that I pinched from a nearby desktop and the printer stopped crashing. Great, progress! This building had a new wireless network installed a couple of months ago so I began poking at that, but nothing had changed in the last couple of weeks.
Instead, assuming the problem was wireless related, I moved across to the print server to change the ports over to the new IP address the device had picked up from the wired network and… pop! Blue screen with b8bb2b3e again. My head flipped back around to PrintNightmare patch issues or something up with the server.
I didn’t really have a direction of travel from this point as Event Viewer had nothing of interest in it as far as I could see, but I decided to send the printer a test print just before it connected. I’ve previously seen printers process an existing queue of work and then crash out or error, indicating that the core technology (network, server, printer) is functioning but some feature is causing an issue, and that’s exactly what I saw here. The device connected to the wired network, sent out the test print page, then immediately crashed out again.
Historically I’ve fixed this (rare) occurrence by removing and re-adding the printer to the print server but I decided to poke around some more and try to nail down which setting was causing this (if, indeed, any at all).
Luckily it didn’t take too many attempts before I switched off SNMP Status Enabled, rebooted the printer, and saw no more crashes. The printer is still working a few hours later (back on the wireless… grr. We’ll run some cable into this particular office soon) and I have since checked the Windows Update installation dates – the printer stopped working right after updates were installed on the server. The server was up to date prior to this month’s (July 2021) patch release so it does look like something in the most recent set of updates caused this. And yes, I have ensured the server is not vulnerable to PrintNightmare by checking for those registry keys, so it’s not that.
If I uncover the exact cause I will update this post.
The authentication… “bypass”… utilised as a first step: D’oh! How did something like that even get into production? The linked article has more details but essentially, if all authentication checks fail (when querying this particular file, not generally), rather than saying “Nope, you are not authenticated!” they say “Oh, you don’t supply a password that we can verify? Ok, let’s give you authenticated status anyway👍”.
Logic failures like this generally don’t happen in the ideal world because they’re blindingly obvious, so allow me to speculate for the rest of this paragraph. I can only assume that a developer temporarily set this up to diagnose a bug or test a feature and simply forgot to flip it back to “fail by default”. This is why peer review is important, though the rush to get things out to production works against this. It’s too easy to miss this kind of thing in the modern world. It shouldn’t be, but it is. There should be no blame here on any individual – I suspect a process/procedure needs to be looked at, or a team needs to be better resourced.
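To make “fail by default” concrete, here’s a minimal Python sketch of the two patterns side by side. It is purely illustrative and not the vendor’s actual code: the request shape, the `password_check` helper and the placeholder password are all hypothetical. The idea is that a check can return True (verified), False (rejected) or None (nothing to verify), and the bug is in what happens when nothing was verified.

```python
import hmac
from typing import Callable, Iterable, Optional

# Hypothetical check type: True = verified, False = rejected,
# None = this check had nothing it could verify.
Check = Callable[[dict], Optional[bool]]

EXPECTED_PASSWORD = "correct horse"  # placeholder; real systems compare hashes


def password_check(request: dict) -> Optional[bool]:
    """Verify the request's password, or return None if none was supplied."""
    password = request.get("password")
    if not isinstance(password, str):
        return None
    return hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())


def is_authenticated_fail_open(request: dict, checks: Iterable[Check]) -> bool:
    """Buggy pattern: only an explicit rejection denies access."""
    for check in checks:
        if check(request) is False:
            return False
    return True  # nothing was verified, yet access is granted


def is_authenticated_fail_closed(request: dict, checks: Iterable[Check]) -> bool:
    """Safer pattern: only an explicit success grants access."""
    for check in checks:
        if check(request) is True:
            return True
    return False  # nothing was verified, so access is denied


if __name__ == "__main__":
    no_password = {}  # a request that supplies no password at all
    print("fail-open:  ", is_authenticated_fail_open(no_password, [password_check]))
    print("fail-closed:", is_authenticated_fail_closed(no_password, [password_check]))
```

The two functions differ only in their final return value, which is exactly why this kind of thing is so easy to leave flipped the wrong way and so easy for a second pair of eyes to catch.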