Crazy It Guy
Tuesday, October 8, 2024
Wednesday, July 3, 2024
Fixed it, up & running!
The sweet 'smell' of success.
Finally just caved and got a 'new' motherboard (new to me). Put my two E5-2630L v4 (ES) processors in. Works like a charm,
Wednesday, September 6, 2023
ChatGPT actually gave some useful suggestions!
Thought I'd share them here.
Physical impacts can cause a range of issues, from visible damage to subtle problems that only manifest under specific conditions. Here's a systematic approach to surveying the motherboard for damage:
1. **Visual Inspection**:
- **Magnifying Glass or Loupe**: Use a magnifying glass or a loupe to inspect the motherboard closely. Look for any visible cracks, broken traces, or damaged components.
- **Check Connectors**: Ensure that all connectors, including the PCIe slots, are intact and not bent or broken.
- **Inspect Solder Joints**: Look for any solder joints that might have cracked or become loose due to the impact.
- **Check for Debris**: Ensure there's no debris or foreign objects lodged between components or traces.
2. **Physical Inspection**:
- **Press Gently**: Gently press down on various parts of the motherboard to see if there's any unusual movement, which could indicate a broken stand-off or mounting point.
- **Check the Back**: Turn the motherboard over and inspect the back for any signs of damage or unusual marks.
3. **Functional Testing**:
- **Power On**: If the visual and physical inspections don't reveal any issues, consider powering on the server. Listen for any unusual sounds and observe any irregular LED patterns.
- **BIOS/UEFI Check**: Access the BIOS/UEFI and check for any hardware errors or warnings. Some systems run a self-test during startup and might report issues.
- **PCIe Slot Test**: Since the impact dislodged a card from the PCIe slot, test that specific slot with a known working card to ensure it's still functional. Also, test the previously dislodged card in another slot or system to ensure the card itself isn't damaged.
- **Stress Testing**: Use software tools to stress-test the system. Tools like Prime95, Memtest86, and FurMark can help identify issues with the CPU, memory, and GPU, respectively.
4. **Check Peripheral Components**:
- **RAM**: Ensure that all RAM modules are seated correctly. Consider running a memory test to ensure they're functioning correctly.
- **Storage**: Check that all storage devices (HDDs, SSDs) are securely connected and are recognized by the system.
- **Cooling**: Ensure that all fans are operational and that heatsinks are securely attached.
5. **External Inspection**:
- **Case and Chassis**: Check the server case and chassis for any signs of deformation or damage. Ensure that the motherboard is securely and correctly mounted within the case.
6. **Seek Expert Assistance**:
- If you're unsure about any findings or if the server behaves erratically, consider seeking assistance from a professional or a specialized repair service. They might have specialized equipment, like multimeters or oscilloscopes, to diagnose issues more accurately.
7. **Document Everything**:
- Take photos of any visible damage or irregularities. This can be useful for insurance claims, future reference, or if you decide to seek professional assistance.
Remember, even if the server appears to function normally after the impact, there could be latent issues that might manifest later. It's essential to be thorough in your inspection and testing to ensure the server's reliability and longevity.
Tuesday, September 5, 2023
ML350 G9, the continuing saga.
Part I: received the server; the box was pretty 'bashed up'.
The case was pretty 'bashed up' too: it had taken a hard impact at the power supplies (the delivery guys probably rested the case on them on the ground).
I've since replaced the power board too, but no luck so far: the same error keeps popping up. It's about an EFUSE (20h), and I have no idea where that is on the board. I suspect it protects the PCIe slots (maybe some of the pins have shorted?), but I don't know where to look.
So: Motherboard first, then some 'flex' power supplies. Let's see where this goes.
And my own post on Reddit describing my 'pains' with the server board: https://www.reddit.com/r/homelab/comments/168o7ib/help_me_resurrect_my_ml350_g9/
Monday, August 21, 2023
ML350G9, the 'final' server.
I have finally found my 'dream' server.
HPE ML350G9, capable of carrying 6 modules containing 4x3.5" or 8x2.5" drives each, so 24xLFF or 48xSFF drives. I'm using ZFS with a SLOG/ZIL for data security. I have 5x 8Tbyte Samsung QVO disks.
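To double-check the bay counts, here is a small sketch of the arithmetic. The module and bay numbers come from the post; the helper names and the single-parity layout in the last line are my own assumptions for illustration, and the usable figure ignores ZFS overhead.

```python
# Drive-bay and capacity arithmetic for the chassis described above.
CAGES = 6                             # drive modules the chassis can carry
BAYS_PER_CAGE = {"LFF": 4, "SFF": 8}  # bays per module, by form factor


def total_bays(cages: int, bays_per_cage: int) -> int:
    """Total bays when every module is of the same type."""
    return cages * bays_per_cage


def raw_capacity_tb(disks: int, size_tb: float) -> float:
    """Raw capacity before any redundancy."""
    return disks * size_tb


def rough_usable_tb(disks: int, size_tb: float, parity_disks: int) -> float:
    """Rough usable space for a parity layout (hypothetical; ignores ZFS overhead)."""
    return (disks - parity_disks) * size_tb


for form, bays in BAYS_PER_CAGE.items():
    print(f"{form}: {total_bays(CAGES, bays)} bays")

print("raw:", raw_capacity_tb(5, 8), "Tbyte")
# Assumption: one disk's worth of parity, purely as an example.
print("rough usable:", rough_usable_tb(5, 8, 1), "Tbyte")
```

With only 5 disks in use, even the LFF configuration leaves plenty of bays free for growth.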
It can take 2x E5 v4 processors. In my case this would be 2x 2630L v4 10 core processors at 55W TDP. For now I intend to add:
- 128Gbytes of LRDIMM ECC memory (power saving and error correcting)
- 8/16 Gbytes of NVDIMM for the zfs SLOG/ZIL
- 2 port 10Gbit SFP+ ethernet card PCIe x8
- 2 port 56Gbit QSFP+ infiniband card PCIe x8
- Fujitsu IT mode SAS 12G card PCIe x8
- PCIe switch card with 4x 1Tbyte NVMe storage (cache) PCIe x8
- PCIe2NVMe card for the boot drive (2Tbyte NVMe), hoping I can boot from it. PCIe x4
- SAS Expander (12G) (no PCIe lanes, just power)
- Nvidia Quadro P2000 (for transcoding, multiple streams possible). No external power supply needed, 75W limit. PCIe x16
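A quick sanity check on whether all of that fits: the sketch below adds up the lane widths from the list above and compares them with what two CPUs can offer. The 40 lanes per CPU is my assumption based on the E5 v4 specification; the way the board actually wires its slots (and what the onboard devices take) decides what really works, so treat this as a rough budget only.

```python
# Rough PCIe lane budget for the planned cards.
# Widths are taken from the list above; the card labels are shorthand.
PLANNED_CARDS = {
    "10Gbit SFP+ ethernet": 8,
    "56Gbit QSFP+ infiniband": 8,
    "SAS 12G card": 8,
    "PCIe switch card (4x NVMe)": 8,
    "PCIe2NVMe boot card": 4,
    "Nvidia Quadro P2000": 16,
}

# Assumption: 40 PCIe 3.0 lanes per E5 v4 CPU, two CPUs installed.
LANES_PER_CPU = 40
CPUS = 2

requested = sum(PLANNED_CARDS.values())
available = LANES_PER_CPU * CPUS
headroom = available - requested

print(f"requested: {requested} lanes")
print(f"available: {available} lanes (assumed)")
print(f"headroom:  {headroom} lanes")
```

The SAS expander is left out on purpose, since it only draws power from the slot.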
Tuesday, June 20, 2023
The challenge of charging Tesla packs.
The problem:
It's been a bit of a 'pain' to charge 16 packs to the same voltage. Once I got to the 16th pack, the 1st one was out of whack again.
The solution:
Friday, April 14, 2023
The storage 'endgame'.
I've been 'playing' with my NAS options. So far:
- The original Xeon E5-2630L v4 NAS. 200W, multiple 10G interfaces (built in 2017/2018)
- Zimaboard - 2x 1Gbit Network ports, 2 SATA ports, PCIe x4
- TMM Lenovo Thinkcentre M900 1xSATA, 2xNVMe PCIe x16
- Topton NAS board, 6xSATA, 2xNVMe, 4x 2.5Gbit ethernet
- Framework laptop motherboard with A+E key 2230 to M.2 2280 converter and a 2280 M.2 card with 8x SATA ports: https://nl.aliexpress.com/item/1005004417694518.html