Tuesday, April 07, 2026

Data Center: Working with NextCloud

 I won't claim this is in any way exhaustive. It was exhausting.

The problem begins like this:

I want to have a "cloud file service" that I can use, and others can use, and I can offer better pricing.

So overall it looks like NextCloud is the right tool. It really has a lot of features I want. Maybe the only thing missing is an S3 interface to the users.

But I'm telling you, this isn't nearly as easy to set up as I was expecting it to be. This little chronicle will describe most of what I went through, and where I ended up. Which is to say, seriously incomplete.

You can get NextCloud installed as part of the Ubuntu installer. That is the best way to go; trying to figure it out after the OS installed was too complicated. Do the easy thing.

First info: I was installing on an HP DL380 G9, with 80GB RAM and a chunk of RAID space. I can't even remember where I started with that, but as of this week it was approx 18TB, plus a small RAID volume of 150GB on 10K SAS drives for the OS install. Sounds easy enough, right? Two RAID volumes, one for the OS, one for cloud data.

So I did the simple install, fixed up the IP addresses, fired up NextCloud, created the admin account, and began to push some files over. I tried to copy all the music library from my desktop machine, to this new "cloud" filespace, 130 GB of data. And it ran out of space pretty quick. So I wondered what was going on, and the answer was that the big RAID volume wasn't in use. 

So I tried to figure out how to make it be in use, and the NextCloud answer is that that is not possible after the initial setup. I think that is where I dropped the ball a year ago. Had other things to worry about.

This week I needed to come back, but I'd forgotten about it being in that awkward unfinished state. 

I had reinstalled Ubuntu/NC some months back, and it was sitting as-yet-unconfigured, so I did the initial setup and noticed that I could specify a data folder; I don't remember seeing that previously. This is important. Tried to copy the music files up again, and it ran out of room at about 50GB. WTF? This machine has a huge RAID, what's the deal?

Well, there's that *little* RAID volume, and that's where NC was putting data: because I didn't tell it not to. And that RAID volume was 150GB total, and it has the OS on it, and whatever the hell else the installer does. AAARRRGGGHHH! Of course it bombed out at 50GB.

Well, I have more drives now, so I pulled those two out, and put a pair of 2T drives in (these two are in the back). Rebooted, reconfigured the RAID to a single group, and--whoops! two of the drives are not the matching size, gotta swap them too. A quick moment on that and now there are 14 2T drives in a RAID 60 group, and I can do a reinstall with a more consistent size. The HP RAID controller really wants all the drives in a group to be the same, and I had had a pair of 146GB drives in the back as their own group. (One thing you could do here is use a pair of SSDs for the OS, 250GB probably being the right size.)

Full OS reinstall with NC, fixed the static IPs again, and retried the copy-the-music-files attempt.

Fails at 80GB. WTF! Well, it turns out, it's somewhat the same deal. The Ubuntu installer has made some assumptions I didn't know about regarding how it created file partitions. I do not know what it was thinking here, but Linux is an annoyingly complex thing at this level. My Unix understanding began 30 years ago, but then there's a huge time-gap where I didn't touch it. So now I can go look at what happened, but I don't understand some things. The bottom line is that Ubuntu has partitioned this one huge RAID volume into pieces it likes, whose entire purpose I don't really know.

So I hunted around to try to understand a bunch of things. Various linux commands would tell me various things. 

mount would tell me that sda1 and sda2 are mounted, but sda3 is not. sda3 is the big remainder space. I don't understand how this is possible, as there's just the one RAID volume, but ok, I know how to format and mount partitions. But no, it's already mounted, and I can't see how or where.

Eventually I saw enough indicators to tell me it has gotten some flavor of LVM partition that isn't very big (100GB), and approx 15GB is inaccessible. This is just not what I want.
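For the record, the commands that surfaced this state for me were along these lines (a sketch; device names will vary):

```shell
lsblk -f      # tree of disks/partitions, filesystems, and mountpoints
df -h         # mounted filesystems and their sizes
sudo pvs      # LVM physical volumes (sda3 shows up here, not as a plain partition)
sudo vgs      # LVM volume groups, including free space still in the group
sudo lvs      # LVM logical volumes (this is where the ~100GB one appears)
```

lsblk -f in particular shows that sda3 is an LVM physical volume sitting under a volume group, rather than a directly mounted partition, which is why plain "mount" was so confusing.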

So I went online for some more info, read a little about LVM, watched a couple videos, ok, yeah, this is interesting, I should learn about this, sounds like LVM will do interesting things I want to do, but this feels like not the time. I really do want to do some of the expand/add/replace routine, but I need to think about how and what. 

So back to the RAID drawing board. I am doing a new pair of RAID partitions, where the two drives in the back are going to be a RAID 1 group for the OS, and the other 12 are a RAID 60 again, that will be an entirely separate mount that I will use for NC from the very beginning.

So that is where I am. OS being installed now. Infernally slow, it took 2 hours earlier. I think the RAID controller is redoing the RAID parity initialization, which takes a day or two at this size.

What I'm hoping to end up with is that the 2T RAID 1 group has all the OS goings on and that the other 16T RAID group isn't even mounted, so that I can do that and tell NC to use that mount as data space. Will be finding out tomorrow how well that works.

Next after that is trying out an S3 server tool or two; pretty sure I need to be able to understand and do this. It doesn't *LOOK* all that complicated, but that has to be a whole nother machine so it doesn't interfere with the NC instance(s). For the moment, I need the behavior, without needing the more serious content management like replication.

And of course LVM is a whole bigger topic. I'd LIKE for there to be a gui tool for this somehow, where I only need to "draw a picture" of what I want to do. Of course I'd like that for network config too.

And now I've just realized that the other LAN-only machines I set up NC on are also improperly configured. Argh. Yep, here's a picture of some of it on one machine:


You can see right there that although sda3 is the big partition, most of it is not mounted at all.  So do I need to do this over again, or can I just expand that thing to fill all the unused space? Or do I make another partition that is lvm like "ubuntu-vg-ubuntu-lv" and join them? I guess we're going to find out; this machine is unused at the moment so it's easy enough to experiment, and it's LAN-only so I don't care about having to re-install the OS. I need to do some more reading.

This is a little bit bogus right off the bat:


You can see that the existing single volume-group is 100GB and that freespace is 22TB. WTAF is THAT about?

Now I wonder how many other machines I have badly configured in this same way? And why is this the default?

I found this web-page helpful here:
https://packetpushers.net/blog/ubuntu-extend-your-default-lvm-space/
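The gist of that article, as I understand it, is a short sequence (a sketch; the LV path here is Ubuntu's default name, so check yours with lvdisplay first):

```shell
sudo lvdisplay                                        # confirm the LV path
sudo lvextend -l +100%FREE /dev/ubuntu-vg/ubuntu-lv   # grow the LV into the free space
sudo resize2fs /dev/mapper/ubuntu--vg-ubuntu--lv      # grow the ext4 filesystem to match
df -h /                                               # verify the new size
```

The two-step shape matters: lvextend grows the logical volume, but the filesystem inside it doesn't know until resize2fs runs.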

This is the before picture on the second machine:

This is the command sequence:



This is the after picture:




NextCloud didn't like being told a specific folder to work with. No idea what was going on there, so I am back to the re-install. I thought about trying to tell the installer up front how to create partitions, but that did the wrong thing.


This is the starting point on disks. Looks good, yes?


This appears to be about to set an LVM group and LV to 14.5T, which is the full disk. Problem: there are two LVM groups, and you can only modify the size of one of them, and that one turns out not to be mounted anywhere; I'm not yet sure how to sort that out.

So I'm back to a re-install where I let it run all the defaults, and will patch it later.



Wednesday, February 04, 2026

Data Center--Set Up ProxMox Cluster from scratch



First we need to create the remote server that will hold the large filespace we will need to mount more than once.


Install Ubuntu on a machine with a lot of space. Make a folder that is going to hold ISOs.


I.e., 


During boot time, make a disk group that is the boot RAID pair, and a second group that is RAID 5 or 6, whichever does the stripe/parity thing. 


Install Ubuntu on the RAID group that is just the pair.


Once that is done, and fully updated, format the larger RAID disk LUN so it can be mounted as /mnt/storage:


sudo mkfs -t ext4 /dev/sdX1


Verify this by doing a mount.


sudo mount -t ext4 /dev/sdX1 /mnt/storage


Diddle as needed to get a successful mount.


Create a folder to hold the ISOs.


sudo mkdir /mnt/storage/ProxMoxISO


Find the ISOs and copy them into that folder.


Edit /etc/fstab to do the same thing on a permanent basis:


/dev/sdX1 /mnt/storage ext4 defaults 0 2
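You can test an fstab entry without rebooting; a quick check along these lines:

```shell
sudo mount -a           # mounts everything in fstab that isn't already mounted
findmnt /mnt/storage    # shows device and fstype if the mount succeeded
```

If mount -a complains, fix fstab before rebooting; a broken fstab entry can stall the boot.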


Install PM 8.3+ on one regular 1U server. 


Create a cluster. This machine will be in it.


Now you want to add a separate server to hold things like the ISOs. Create a folder for this and put at least one ISO in it. 


ALT: have one machine on the same physical network host an iVentoy PXE boot server.


This will need the ProxMox ISO as well as any Ubuntu or other ISOs.


You can put iVentoy into a VM, which seems to work OK for separate machines.



The VM ISO storage folder is /var/lib/vz/template/iso

Wednesday, January 28, 2026

Data Center file access for VMs

So the next problem in line to work on is how to give a VM access to a file-space of data that is larger than its own disk image.

First some background:

Virtual Machines (VMs) are created with a fixed amount of "disk space". This amount is whatever number you give it when you create the VM. In this era of giant disks, no reason to be cheap on this; I will discuss with the customer/user how much they want, but I don't expect to give anyone more than 1TB.

Servers with four disk drive slots can hold ~100TB of space. They can run a max of 20-24 VMs without trouble over CPU core utilization (where each VM uses one core; fewer VMs if they use more than one core). So in theory each VM could have ~5TB of its own space. Those are big VMs at that point.
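The back-of-envelope above, spelled out (rough numbers from this paragraph):

```shell
# Split one machine's ~100TB evenly across its maximum VM count.
total_tb=100    # approximate usable space in a four-slot server
max_vms=20      # one core per VM, roughly 20 cores to hand out
per_vm_tb=$(( total_tb / max_vms ))
echo "~${per_vm_tb}TB per VM if split evenly"    # → ~5TB per VM if split evenly
```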

If a user needs more than that, they should use a separate file-share hosted on another machine whose purpose is hosting and sharing files. This article is about that setup.

There are several ways to do this. I have not investigated them all. Several are better integrated into ProxMox (CEPH and ZFS for example). 

I did, ~2 years ago, have a Windows Server 2022 DC Edition in operation. That particular group of five machines had really weak passwords and got hacked after I put it online. I wiped those several machines and started over, but I didn't bring back the Windows share(s) I had created...that was a "lab cluster" so it didn't matter too much.

This time, however, I want to do this better, with greater controls. So Samba it is.

Step one is install Ubuntu as the host OS. Standard approaches apply, either a USB or a PXE server. I'm trying to only use PXE at work, with a USB at home so I don't have to have another server running at home.

---

Turns out Samba on Linux is harder to set up and configure than it ought to be. What an annoyance. Samba would be fine for other situations, but not this one, where I need the size-limit quotas.

Seems likely to be easier with Windows Server, which has quotas built in. I never set them up last time; didn't care at that moment. Shares are advertised via CIFS.

I have to wonder whether an iSCSI box would do that as well. No, apparently not intrinsically.

Linux servers wanting to use a Windows CIFS share need to install smbclient and cifs-utils. You will want to mount CIFS shares via fstab at least some of the time. Pretty sure I did that before.
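A sketch of what that looks like; the server IP, share name, and credentials file here are placeholders, not from any real setup:

```shell
sudo apt install smbclient cifs-utils
sudo mkdir -p /mnt/winshare
# one-off mount:
sudo mount -t cifs //192.168.1.50/share /mnt/winshare -o credentials=/root/.smbcred,uid=1000
# equivalent /etc/fstab line for a permanent mount:
# //192.168.1.50/share /mnt/winshare cifs credentials=/root/.smbcred,uid=1000 0 0
```

The credentials file holds "username=" and "password=" lines, which keeps the password out of fstab; lock it down with chmod 600.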

---

Installing Windows

Weirdly, it turned out I had 8 4T drives and one 3T drive in the RAID, so I pulled the 3T out, recreated the RAID, and started the Windows install. It then proceeded to bitch about not being able to install onto that logical drive, wrong partition type, blah blah something. Did some online reading that didn't exactly give the answer about what to do. (That 3T in the previous RAID was doing a negative thing for me, reducing the max size on the user side.)

The REAL answer: Windows wants to install to an MBR partition, which has a max size of 2T. I swapped out drives so that I could create a 2x2T RAID 1 mirrored group, and one 7x4T RAID 5 group. The machine was damaged in transit before I received it, so I can only put 9 total drives into it, thus 2 + 7. That's good for now.

Windows then installs cleanly, so the problem was not those other weird things I read about; it's the drive capacity/size. Install Windows to a 2T bootable volume, and let the other, bigger one be the CIFS share.

Windows Server is a pretty easy install. You have to run Update after that, but that's fine. You should automate that anyway.

---

I'll need to be careful about the firewall aspects of all this, in the gateway as well as individual machines. I mentioned getting hacked in 2024. I should have been more careful about controlling the gateway to block things unwanted. As well as forcing better passwords, maybe even 2FA.

Wednesday, January 21, 2026

Data Center: Using NextCloud

It is STAGGERING how many web-pages purport to tell you where the NextCloud files are and just don't.

If you install Ubuntu in mostly default mode, at one point there's an option to install some 3rd-party tools along the way, and one of the choices is NextCloud.

https://georgenswebsite.com/operating-systems/ubuntu-server.jpg


You can see NextCloud is the second choice. It installs into some folder, you aren't told what it is. Mostly you don't care.

Answer: /snap/nextcloud/current/

Why I went to look for this: I'm getting an error I don't understand, it says I need to look at the log files. Which I can't find. 

/snap/nextcloud/current/ has a folder called "logs" but that is empty. I did find an indication that maybe that just holds mysql errors.

Some online reading suggests the log files are in the "data" directory, but I can't find that either.

Apparently NC uses an Apache install as its server/interface. Normally you would find that in /var/www/ but no.

If you did a manual installation:

https://docs.nextcloud.com/server/stable/admin_manual/installation/installation_wizard.html

that says /var/www/ but the ubuntu/snap installation does something else...you get a bunch of defaults chosen behind the scenes, which are difficult to change later.

One thing I would like to do is occasionally add disk capacity, but that has looked difficult. Setting NC to use non-default file-space looks hard, at best.

Best I can tell is that the data folder is inside the Apache folder, and the logs I want are there. But as I write this I don't know where Apache is either. And "data" is def not in /snap/nextcloud/


The intent here appears to be that your NextCloud host machine is doing NC and *nothing else*, and you don't want to mess with it.

So when you do the initial NC setup, and are giving the admin account name, you get this:

So the data will be in /var/snap/nextcloud/common/nextcloud/data/.

Logs are in /var/snap/nextcloud/current/logs/ -- You will have to sudo to look at this, and maybe chmod to "logs" in order to get into the folder to look at the files. The window with the error message was this:

which isn't especially helpful. Turns out it means the server has to name itself as a "trusted machine". You have to go edit 

/var/snap/nextcloud/current/nextcloud/config/config.php

to fix this, there's an entry for "trusted_domains" that needs IPs added. You probably want to be VERY specific about this.

Check it first with:

nextcloud.occ config:system:get trusted_domains

If you need to add one or more, do this:

nextcloud.occ config:system:set trusted_domains 2 --value=exampledomain.com

where exampledomain.com could be an IP address. You can have multiple entries; the "2" is just the index in the list. I tried changing this, rebooting, etc. It didn't help, so it was simpler to just reinstall from scratch than to try any harder to figure it out.

What my problem had been: I changed the underlying IP address for the machine. NC has some security controls related to this, so that's harder than it ought to be. (I am also using ProxMox, that has some complicated IP aspects.)

Your database info is this:



If you want to do external access into the database (MySQL in this case) you should show and save that password; it's a random string. An inch farther down is the "Install" button. I clicked it and it took a good while to finish up, a couple of minutes in the end.

This is all almost absurdly complicated. Don't be wanting to fiddle with it more than once, that's for sure.

Once that "Install" is done, you now have a data folder, and can go, on a command line, to that folder to verify that an admin folder exists under the name you just gave.

I went ahead and installed "Recommended Apps" because I expect customers to want to use some of them.


I am hoping that "Talk" is like Zoom, but it might be more like Slack. Apparently it is.

https://nextcloud.com/talk/


I have read that you can change which folder holds all the "data space" for NC, away from the default. It too sounds difficult, when in fact it should be made easy, to let me add space down the line when I can afford bigger disk drives.

The way to begin with this is to get a standard HP DL380 with 15 LFF 3.5" drive bays. They aren't expensive. Put a full 15 4TB drives in there, and configure that at boot time as one large RAID 50 (or 60 if you have had disk troubles). Then install Ubuntu Server (using the PXE boot installer if you followed that other post), and check true for "install NextCloud". All the default things will behave the right way; give it the IP you want, and all is good.

Now you are at the startup screen to make the admin account, and can begin creating accounts, etc.

Friday, January 16, 2026

Data Center: PXE Boot

If you want to create a Data Center you are going to be installing an operating system on a lot of computers, and you're going to want to do it pretty quickly. How do you accomplish that?

There are 4 ways to get it done:

1) Buy the computer with the O/S already installed. That just means that someone else is doing the install, and that has a cost. Maybe if you create a master boot disk, and clone the disk onto other disks, you can take them to new machines. I actually did this about 10 years ago, for about a dozen machines in a single rack.

2) Install from a DVD. Worked fine for older machines, but wasn't fast. And nowadays, machines don't come with internal DVD drives. I have a few that do, but they are Gen 6 and Gen 7. Probably never buying those machines again.

3) Install from a thumb drive. Probably faster than a DVD, and all modern machines do have a couple USB ports, but now you're trying to keep track of thumb drives, and I can tell you from experience that that's harder than it should be. I have several thumb drives for ProxMox 9, and I still have trouble keeping track of them properly. They WORK OK, but they're small and hard to see.

4) Network boot, aka PXE boot. I think all my machines can do PXE boot, so I finally created a server for this, after one too many times of re-downloading ubuntu-25.iso for VMs.


So PXE is pretty nifty, and once working, is great. There are hard ways and easy ways to do it. I tried a hard way last year, it didn't work, and I dropped it for a while. Thought about writing my own really simple one.

Recently I found an easy way to do this, using a free tool called "iventoy", which you can pick up at 

https://github.com/ventoy/PXE/releases

There's a paid version too that will do more machines at the same time.

iVentoy is super-easy to use. It has two interfaces: one is a web page so you can look at what it is doing, and the other is the service that delivers an ISO file to a computer wanting it.

PXE Boot is what you want to do to avoid DVDs and thumb drives. Network is faster, it's easy to use, you set it up once, it can deliver multiple different OSes because it's just a file tool. I think I have it configured with 4 or 5 different ones, for my several needs.

Download iventoy someplace you can use it, unpack it and it's ready to go. Download the ISOs you want, there's a subfolder inside the iventoy folder called /iso/ where you drop them. Run iventoy as root. Look at it from a nearby web-browser window, it will show you a little bit of info, including available ISOs. Click the green rectangle to turn it on.
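Roughly like this; the tarball name, ISO name, and web port here are from my memory of the free release, so substitute whatever you actually downloaded:

```shell
tar xzf iventoy-1.0.21-linux-free.tar.gz
cd iventoy-1.0.21
cp ~/Downloads/ubuntu-24.04-live-server-amd64.iso iso/
sudo ./iventoy.sh start
# then browse to http://<this-host>:26000 (the default web port, I believe)
```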


Above is the IP management. You can see the IP Pool numbers I gave it. I'm probably not doing more than 2 at a time right now, I just need this to not bump into any other machines, and that range is an otherwise empty space.

These are the 3 ISOs I have in place that might install. There were a couple more last month.

Now fire up another computer you want to install on, at the right point either press F11 for Boot menu, or F12 to go straight to Network Boot.

In a minute or so, if it's going to work, some messages will scroll by and then you will get the text-based iventoy PXE OS-selection menu, all the ones in the iventoy iso subfolder. Pick one and let er rip! Very quickly you will be at whatever is the starting messages and then menu options for that ISO.

You are good to go!

What I have done is actually create a VM on my first ProxMox machine, and load iventoy into that VM. I have set that VM to autostart on boot, and iventoy is set to autostart via cron inside that VM. That way the PXE Boot Server is ready to go all the time; if the VM crashes it's a quick restart.

I made a shell script to deal with needing to cd first:

nano run-iventoy.sh

    #!/bin/bash

    cd /home/YOU/iventoy-1.0.21

    /home/YOU/iventoy-1.0.21/iventoy.sh -R start

chmod a+x run-iventoy.sh

CRON: 

sudo crontab -e

(go to the bottom of the file)

@reboot sleep 30; cd /home/YOU/iventoy-1.0.21;/home/YOU/run-iventoy.sh 

Then you need to copy over the .iso files you want available, into 

    /home/YOU/iventoy-1.0.21/iso

So finally this is an easy thing to have running. I'd known about it for years but never had a need. Now I have a VM of iventoy that self-restarts on boot. 

---

One last thing about this: you can also PXE Boot the VMs in ProxMox. There's a menu point when creating a VM where you have to tell it the "CD" source you want to use. If instead you hit the radio button for "No CD" it will PXE boot. So you can have one VM PXE Boot from another VM and only need to have the desired ISOs in that one place on that one VM...assuming of course that your networking is properly in place to do this.

There are probably going to be some limitations here that I haven't bumped into yet. iVentoy has a limit of 20 machines using it. I don't actually know what that means, because I hit it once for some reason and a restart wiped that out. So I think you can do 20 in parallel, for whatever you want. I have installed ProxMox itself as the host OS, and Ubuntu 24/25 as VMs, on the same machine and different ones.

Iventoy VM seems like one of those things that should be a pre-packaged VM you can just download. I don't know if ProxMox can even do that.

Data Center: networking

This is a really complex topic. I had no idea how complex for quite a few years. Have had to learn a good bit of it over the last decade, and what I now know is that I don't know very much.

This will feel a bit disjoint for a while, I hope it gets better by the time I finish writing it.

My previous experience with networking was not on this scale, and that's where the complexity comes from.

I would like to have the entire Center be on IPv6, but at the moment that's not going to happen. My casual impression is that v6 continues to be not quite ready.

I need to make some pictures for this as well.

One important thing to note is that this whole problem is a large multi-machine problem, between network gateways, network switches, and servers. 

Step one is that you need to know how to configure a gateway that faces the outside internet on the WAN side (as opposed to a corporate upstream), the pass-through mechanisms (aka NAT rules), how to configure (or not) network switches, and how to configure actual computers and virtual machines.


Gateways and NAT rules.

The gateway serves as the security interface between the outside world (WAN) and the inside world (LAN).

I have been using Ubiquiti devices for this, because they do a lot more than I need for Data Center behavior (i.e., wifi networks).

NAT rules are the mechanism for how WAN and LAN translate to each other. For outbound traffic that originates inside, default behavior already exists and works fine, until you want isolation so that two computers can't bump into each other for bad reasons. A pair of NAT rules, one "Source" and one "Destination", will allow traffic originating outside to talk to one specific machine inside. This works fine, and I have used it a number of times so far, in testing. (Not providing a tutorial on NAT rules here, the key info is that you use Src/Dest NAT rules--took me a good while to find that out, with an explanation.) NAT rules are used to bash IP addresses in network data packets, just make the numbers line up properly and you're good.

The gateway(s) I am using have fiber ports for WAN and LAN, as well as 8 RJ45 ports for LAN (and one RJ45 for WAN, but that's a 1G uplink port, so not good on a large scale). I am planning for 10Gig fiber uplink WAN, and 1Gig LAN activity, until I find that inadequate.

On the LAN side you can create multiple subnetworks in the various LAN sub-ranges. I don't know if there's an upper limit; I expect there's a practical limit.

Going along with my DC philosophy of sharding everything, the network gets sharded too. The Center as a whole has multiple incoming fibers, each fiber feeds one row, each row has 20 racks, each rack has 20 servers, and each server could have 20 VMs. OK, that's 8000 VMs per row, probably over the practical limit. We'll see if we get there. The shard is probably at the rack level, with ~400 VMs.
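The arithmetic behind that, spelled out:

```shell
# Capacity math per incoming fiber (one row), using the counts above.
racks_per_row=20
servers_per_rack=20
vms_per_server=20
vms_per_rack=$(( servers_per_rack * vms_per_server ))
vms_per_row=$(( racks_per_row * vms_per_rack ))
echo "per rack shard: $vms_per_rack VMs, per row: $vms_per_row VMs"
# → per rack shard: 400 VMs, per row: 8000 VMs
```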

400 VMs is a lot of activity at the gateway level, but not really a problem regarding the IP addressing, and very straightforward.

What is less straightforward is the networking definition about packet routing for security and efficiency.


Networking inside a server.

It took me months on a casual basis of reading and then weeks of heavy working creating and testing different configurations until I learned enough to do half of what I want.

What I want: VMs are effectively isolated from each other on the network, preventing security problems, and routing to be efficient after that when VMs are busy doing network I/O. 

The picture is like this:

I have servers, HP DL360/380, with multiple network ports on the back:

1) ILO, which is separate machine management. 1 port.

2) RJ-45 copper ethernet. 4 ports.

3) SFP/Gbic fiber ethernet. 2 ports.


I want the RJ45 ports used like this:

1) host cpu. single IP address

2) for half the VMs. effectively multiple IPs

3) for half the VMs. effectively multiple IPs

4) VM backup management. Single IP.

5) Fiber for big data

6) Fiber for big data

I do not yet have this, and there may well be a further breakdown that I need. It's certainly possible that #1 and #4 should be combined. #5 and #6 perhaps could be as well.

What I have at the moment works, but doesn't have the isolation I really want.

Here's what I did that works:

RJ 45 port 1 is separate, as desired, will allow host access, has one IP, doesn't hit the VMs.

RJ 45 ports 2 and 3 are bonded (i.e., shared). That bond group supports a Bridge. The VMs use that bridge.

Relatively speaking, this configuration is pretty trivial, but learning exactly what it is and does took me longer than I wanted.

How it works:

The Linux kernel includes Layer-2 network bridging behavior at runtime. You can define various control concepts for this in huge detail, well beyond what I understand.

Because the OS/VM-manager I am using is ProxMox, the network definitions are in the file /etc/network/interfaces. You edit this with "nano", like this:

nano /etc/network/interfaces

ProxMox, upon installation, creates as your machine's network interface, one "bridge", with an IP you give it (possibly originating with DHCP). This bridge names which network port it will use (can be the fiber port), and the IP address. The host OS is what you reach on this IP (VMs will of course get their own IP or you will give a static one). VMs, as created, will default to using this bridge to get to the outside world (or other VMs if that is what you are doing). This is all the default and works fine, except that most of the network physical connections are going unused.

So an alternative would be to create a bond instead of a bridge, and tell the bond to use all the ports (or a subset, leaving the other ports unused). Using the other ports requires giving them their own IPs.

What I had hoped to do was create one IP (#1) and two bridges (port #2 and port #3), where port #1 would be on one subnet, port 2 would be a separate subnet, port 3 would be a third subnet, port 4 a fourth IP, etc., such that 2 and 3 are supporting VM external I/O. That seemed to not work properly at all; two bridges seemed problematic. It's entirely possible I goofed up something else, and I have some further experiments to do here.

Here's my final "interfaces" file content:


auto lo

iface lo inet loopback


auto nic0

iface nic0 inet static

        address 192.168.1.140/24

        gateway 192.168.1.1


auto nic1

iface nic1 inet manual


auto nic2

iface nic2 inet manual


iface nic3 inet manual


iface nic4 inet manual


iface nic5 inet manual


auto bond0

iface bond0 inet manual

        bond-slaves nic1 nic2

        bond-miimon 100

        bond-mode balance-tlb


auto vmbr0

iface vmbr0 inet static

        address 192.168.1.141/24

        gateway 192.168.1.1

        bridge-ports bond0

        bridge-stp off

        bridge-fd 0


source /etc/network/interfaces.d/*


You can see "nic0" (copper port 0) has its own IP. This is for host use. The "bond" has two "slaves", copper ethernet ports nic1 and nic2, "balance-tlb" means the sharing I want here (there are other options but they seemed wrong). The "bridge" uses the bonded pair of nic1 and nic2 for I/O; VMs will be created using this bridge.

It took me a while to understand enough to use this definition properly such that it worked the way I wanted. I had wanted to have two bridges, with one copper ethernet port for each, and for different VMs to use either one, which WOULD work the way I wanted. But ProxMox will not let you create the second bridge that way; it aborts, because creating a second one tries to create the same "default gateway" over again. You can create a second bridge by editing the file, and it works how you want after that, almost. This is where I want to do a little more experimentation that I didn't get to: the gateway gets more sub-networks defined, they connect to specific ports (nic1 or nic2), have different IP address ranges, etc., achieving a greater measure of VM isolation for security.
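The hand-edited second bridge I'm describing would be a stanza something like this in "interfaces" (addresses are hypothetical; note there is deliberately no second "gateway" line, which is the thing the GUI chokes on):

```
auto vmbr1
iface vmbr1 inet static
        address 192.168.2.141/24
        bridge-ports nic2
        bridge-stp off
        bridge-fd 0
```

VMs can then be pointed at vmbr0 or vmbr1 when you create them.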

I can't do the next stage of learning/experimentation without buying more network switches for the physical isolation I want, where ILO is one physical network, port 1 is one physical network, port 2 is another physical network, etc. This helps with security, and loading efficiency. 

The casual IP assignment right now is like this:

.1.20/24 = host machine N

.1.21/24 = bridge

.1.22-39 = VMs, with some extra cores

.1.40/24 = host machine N+1

etc. Numbered like this means approx ".1.X" is for ten host machines in the rack, ".2.X" is for the other ten.

I have the feeling I'm going to grow to dislike that, but for the moment I am able to remember it all properly, although I do have a chart on the wall.

What I WANTED originally probably looks like this, and I may have to do it sooner rather than later:

.1.16 network ID (/30)

.1.17 local gateway

.1.18 usable IP

.1.19 broadcast

.1.20 network ID (/30)

.1.21 local gateway

.1.22 usable IP

.1.23 broadcast

where each VM gets its own little micro-subnet. This would require the gateway to have a lot of subnetworks defined, in these four-pack groups. Again, here I think we are bumping into practical limits, but I don't know.
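The four-pack arithmetic is easy to check with a few lines of Python (a sketch; the 192.168.1.x addresses are just stand-ins for the ".1.x" shorthand above):

```python
import ipaddress

# Carve a parent block into /30 micro-subnets, one per VM.
# Each /30 holds exactly 4 addresses: network ID, two usable
# hosts (local gateway + the VM), and broadcast.
block = ipaddress.ip_network("192.168.1.16/28")  # covers .16 through .31
for net in block.subnets(new_prefix=30):
    gw, vm = net.hosts()                         # the two usable addresses
    print(f"{net.network_address}  gw={gw}  vm={vm}  "
          f"bcast={net.broadcast_address}")
```

The first two rows match the .16-.19 and .20-.23 groups above. A full /24 carved this way yields 64 such groups, so the practical limit is really how many subnet definitions the gateway box will tolerate.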

More work, and more writing, on the way.

---

Update Jan 31.

I realized a week later I had forgotten something about how my overall network is set up, and had done some wiring wrong.

My gateway has several LAN networks created, and I had forgotten that they were coupled to different ports on the gateway, so I had things plugged in wrong; hence the failure that I incorrectly attributed to the "interfaces" file (or equivalent).

So I bought a couple of 24-port gigabit network switches (these things are dirt cheap on eBay) and corrected the wiring so that the VMs are attached to a bridge that uses the next port and has an address on the next subnet. Crimped a couple of new wires to help out.

Works! Yay! Finally! Now I have to go back to my IP numbering scheme and completely rework it. This should now be cleaner. 

Also, I need some more gigabit network switches, and I need to get my 42U frame rack back so I can do this better. It turns out that although I had designed for network switches at the top of the rack, they really should go in the middle, to minimize cable lengths and keep them from flapping around a bunch. Probably going to have to do tie-downs too, or wrapping. I don't like too much loose wire flapping.

Need to come up with a color scheme, too. Black CAT5 wires are cheaper, but you can't tell them apart, and I will have trouble seeing them.

I will add a drawing here when I have it finished.

---

Update Feb 4.

So I don't remember reading this anywhere else... it turns out that the LAST THING in your "/etc/network/interfaces" file is the one that becomes the "default route".

So in the example with nic0, bond0, and vmbr0, vmbr0 became the default route, not nic0 like I wanted. Moving the "nic0" definition to the end of the file fixes the problem.
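A minimal sketch of what that reordering might look like (interface names and addresses are my assumptions, based on the earlier example):

```
# /etc/network/interfaces -- sketch; names and addresses
# are assumptions, not my actual config
auto bond0
iface bond0 inet manual
        bond-slaves nic1 nic2
        bond-mode balance-tlb

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.21/24
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

# nic0 moved to the END of the file, so ITS gateway
# wins as the system default route
auto nic0
iface nic0 inet static
        address 192.168.1.20/24
        gateway 192.168.1.1
```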

Friday, January 09, 2026

Data Center: Plug-in cards for HP DL360/380

I don't know how many cards could go into these machines; these are just the ones that I have and have used.

There are two kinds of sockets in the back that I have experience with. Most are PCI; one is called Flexible LOM. Both 360 and 380 servers have the LOM socket, which direct-connects to the mobo. The PCI sockets are on riser cards; a 360 can take 2 or 3 PCI cards and one LOM, a 380 can take 3, 6, or perhaps 8, and one LOM.

USB expansion: this PCI card has four USB sockets; one is internal only, maybe for an internal thumb drive? I don't have a server where I need extra USB ports. Keyboard and mouse is all, except when I have to install an OS from a thumb drive. I am now using PXE boot, so generally I don't even need the thumb drive, and I would go to the terminal UI for that situation. Maybe an external DVD drive?

4X GbE network card: four standard gigabit ethernet ports. They work exactly like you think. This is a Flexible LOM card.

2X 10GbE network card: copper RJ-45 ethernet ports. These too work like you expect, except they are faster. This is a Flexible LOM card. For 10GbE over copper you are going to want the better cables (Cat6a).

2X 10GbE fiber network card: two SFP+ cages for fiber ethernet. This is a Flexible LOM card. Takes standard SFP+ modules.

I expect there's a fiber-channel adapter LOM card, as well as PCI cards. Maybe Infiniband?


The machines only have one LOM socket; the rest are PCI sockets. In a DL380 you could have 6 PCI sockets. Any of those can take a network card, just not one of the LOM cards. I have several 380s with six PCI slots.

The presence or absence of these cards has nothing to do with whether they are configured for action, that's a whole different thing.

I still don't know whether adding the fiber LOM card "turns off" the RJ45 ports. It has seemed to in the past, but that might have just been me not knowing what was supposed to happen. There was a time, about a year ago, when the ProxMox 8.0 installer didn't have the proper device driver for fiber, and I was trying to install on a machine that had no copper ethernet, just the fiber LOM card. PM installed OK, but didn't pick up time, or DHCP, or any active networking. Updating would of course have wanted a live network connection, which it didn't have, so it couldn't update itself to use fiber because it needed the fiber already working in order to do the update. StOOpid. A few months later a newer PM installer had the fiber driver, so all was good.

There's a new thing I need to learn about operating on disk drives; it requires a new flavor of SAS interface card similar to the RAID card, but I'm going to use an entirely separate/different computer for this. Assuming success occurs, I will blog about that when the time comes.

Thursday, January 08, 2026

Data Center Server computers and their disk drives

512 vs 520

Learned this one the hard way, didn't understand some details before I spent some money. Fortunately only a little bit of money. One of the sellers tried to warn me, but I didn't know what he was saying, and I didn't understand the situation yet.

There are several features/details about disk drives you want to understand before you buy any and end up with units you cannot use.

I am only addressing the machines I am using, so others may be different. Those machines are:

HP DL360 Gen 6/7/8/9

HP DL380 Gen 7/8/9

All of these machines have internal RAID hardware. Sometimes that hardware is a removable daughter board, sometimes it's directly on the mobo; I have both. I have the 410, 420, and 440 model/version numbers.

Sometimes the RAID software interface is an old-school text interface (on Gen 6 and 7, specifically), with a mousable GUI on Gen 8/9 (and presumably 10/11).

Your first criterion of concern is the physical size, either 2.5 inch or 3.5 inch, and the quantity the machine can mount.

For the machines that will take 3.5 inch: these come in both 1U and 2U. You will need matching caddy trays that are 3.5 inch and match the machine; again, Gen 6/7 are different from Gen 8/9+. Finding the right screws is important too, and the screws for a 3.5 inch drive are different from a 2.5 inch. Quantities that can be installed are 4 in a 1U server, and 12 or 15 in a 2U; imagine having 15 28-TB drives in your machine; that's 420 terabytes. (I have seen pictures of another brand's 2U server taking 24 drives!)

For the machines that will take 2.5 inch platter or SSD: also comes in both 1U and 2U, quantities vary, from 5 to 8 to 10 to 16 to 24. Platter drives in 2.5 inch are three rotation speeds: 7200, 10K, and 15K. The most common is the 10K. Capacity goes from 36GB to 72GB to 146 to 300, 450, 500, 600, 900, 1.2 TB, and I think there's a 1.8TB unit. I have a few of nearly all those sizes. For a data center you really want the units from 500 to 1.2T. Most all those are 10K, which is good.

The faster a platter spins, the faster you can read data from it. So 10K is faster than 7200, and 15K is faster still. Speed correlates inversely with capacity: there are no 10K or 15K 20TB drives.

SSDs are 5-10X faster right out of the box. You really want to use them as much as you can, esp given that in the same physical size you can get up to 8TB per unit. They are a good bit more expensive. EBay will sell you a 500GB 10K 2.5 inch platter drive for $20, a 900GB for $25. These are pretty good deals until you need a lot of space.

All these details you can read on the device label.

The thing you CAN'T read on the label is what the bytes-per-block/sector number is. There are several possibilities here, 512, 520, and I have one drive that is 4096. (Later: I discovered that I have some that are 4160, whatever THAT is)

My machines, the HP DL360/380s, will only take 512s. The 520 isn't extra user capacity; the additional 8 bytes per sector hold protection/integrity metadata that some storage controllers (SANs especially) use for their own checksumming. The 4096 is going to be a bit wasteful for small files, but that drive is 10TB so you probably don't ever notice. These drives get used in other places like a SAN, which is all about capacity, or some form of JBOD.
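A back-of-envelope sketch (my own arithmetic, not anything from HP) of what those 8 extra bytes per sector amount to on a nominal 600 GB drive:

```python
# 520-byte sectors carry the same 512 bytes of user data plus
# 8 bytes of protection/integrity metadata per sector.
DATA_BYTES = 512             # user data per sector, both formats
EXTRA_BYTES = 8              # the 520-format addition
capacity = 600_000_000_000   # nominal 600 GB (decimal, as on the label)

sectors = capacity // DATA_BYTES
extra_total = sectors * EXTRA_BYTES
overhead = EXTRA_BYTES / (DATA_BYTES + EXTRA_BYTES)

print(f"{sectors:,} sectors")
print(f"{extra_total / 1e9:.2f} GB of per-sector metadata")
print(f"{overhead:.1%} of raw media")
```

So the 520 format doesn't buy the user more space; it spends about 1.5% of the media on metadata that only certain controllers know what to do with.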


512 vs 520 

Before you click Buy It Now make sure you know the answer to this, else you just bought a brick.

-------------

That said, it is my casual understanding from reading online that it is possible to convert one to the other, but the process sounds a little complicated, and it cannot be done on my machines.

What I have read of that conversion: first off, the standard RAID firmware on my machines cannot do it, and will barely even tell me about it. Install one, go to the Gen 8/9 GUI-based Smart Storage Administrator, look at the details on one of the physical drives, and the drawer window on the right will say something like "can't be used for RAID", and maybe also "520". So you are going to have to take the drive to another machine with a daughter card (or PCI card) that is configured for HBA mode, not RAID, so that software has more direct access to the drive; then some other software tool (ShredOS, apparently) will let you turn a 520 into a 512. I have never even gotten close to trying this. One of these days, maybe...

Later: I have received a PCI card for a standard PC that is an HBA adapter that supposedly will let me plug in a drive, talk to it directly, and tell it to reformat to 512. Apparently that will take some hours, or overnight for bigger drives. Also part of the picture: you boot something called ShredOS from a thumb drive, and it uses the HBA card to talk to the disks. The story is that you can do several at the same time.