Info Fast ZFS Storageserver with Oracle Solaris, OmniOS and napp-it


thecoolnessrune

Diamond Member
Jun 8, 2005
9,664
558
126
Hey @gea when napp-it makes this change in stable, will mini_httpd be removed / cleaned up on upgrade when Apache is installed? Or simply no longer referenced?
 

gea

Member
Aug 3, 2014
188
11
81
Apache will be included in napp-it 22.06 and used by default.
mini_httpd can then only be started on demand: "/etc/init.d/napp-it mini"
 

gea

The multithreaded, ZFS-integrated Solaris SMB server supports file- and folder-based NTFS-like ACLs as well as share-level ACLs. Share ACLs are ACLs on a share control file, /pool/filesystem/.zfs/shares/filesystem. This file is created when you enable a share and deleted when you disable it. Share ACLs are therefore not persistent.

In the current napp-it 22.dev, share ACLs are preserved as ZFS properties. When you re-enable a share you can now restore the last share ACL or apply a basic setting like everyone full, modify or read.
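
For anyone who wants to set such a basic share ACL by hand, here is a minimal sketch. It only prints the chmod call so you can review it first. It assumes the filesystem is mounted at /<pool>/<filesystem> and that the share is named after the last path element (as in the path above); tank/data is a made-up example, and this is not the code napp-it itself runs.

```shell
# Dry-run sketch: prints the chmod command for a share control file.
# Assumption: mountpoint is /<pool>/<filesystem>, share name = last path element.
share_acl_cmd() {
    fs="$1"        # e.g. tank/data (hypothetical)
    level="$2"     # full | modify | read
    case "$level" in
        full)   set_name=full_set ;;
        modify) set_name=modify_set ;;
        read)   set_name=read_set ;;
        *) echo "unknown level: $level" >&2; return 1 ;;
    esac
    share="${fs##*/}"
    echo "/usr/bin/chmod A=everyone@:${set_name}:allow /${fs}/.zfs/shares/${share}"
}

share_acl_cmd tank/data modify
```

Use /usr/bin/chmod, as the ACL syntax is specific to the Solaris/Illumos chmod.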
 

gea

Save energy on your napp-it backupserver

Energy costs have multiplied since last year. That is a real problem for a backupserver that is up 24/7 when you only want to back up your storageserver once or a few times a day, especially as ongoing incremental ZFS replications are finished within minutes.

A money-saving solution is to remotely power up the backupserver via IPMI, sync the filesystems via ZFS replication and power off the backupserver when the replications are finished. To simplify and automate this, I have created a script that runs as a napp-it 'other job' on your storageserver.

Details, see https://forums.servethehome.com/ind...laris-news-tips-and-tricks.38240/#post-357328
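
The power-on / power-off part can be sketched with plain ipmitool. This is not the script from the link, only an illustration of the idea. The BMC address 192.168.2.50 and the user admin are made up, and the password is expected in the IPMI_PASSWORD environment variable (ipmitool option -E). By default the commands are only printed.

```shell
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# -E: ipmitool takes the BMC password from the IPMI_PASSWORD environment variable
bmc_power() {
    run ipmitool -I lanplus -H "$1" -U "$2" -E chassis power "$3"
}

backup_cycle() {
    bmc="$1"; user="$2"
    bmc_power "$bmc" "$user" on      # wake the backupserver
    # ... wait until it is reachable, then let the replication jobs run ...
    bmc_power "$bmc" "$user" soft    # request a clean ACPI shutdown
}

backup_cycle 192.168.2.50 admin
```

"chassis power soft" asks the OS for a regular shutdown, which is what you want after a replication; "chassis power off" would cut the power hard.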
 

gea

OpenIndiana Hipster 2022.10 is here


OpenIndiana is an Illumos distribution and more or less the successor of OpenSolaris. It comes in a desktop edition with a Mate GUI, browser, email and office apps, a text edition similar to OmniOS bloody, and a minimal distribution. Usually you install the desktop or text edition. Minimal lacks essential tools.

While OpenIndiana Hipster follows ongoing Illumos development (every pkg update gives you the newest Illumos, so it is more or less an Illumos reference installation), there are annual snapshots that give beginners a tested starting point. This is the main difference from OmniOS, where stability with dedicated stable repositories is the main concern.

During setup, select your keyboard but keep language=en when using napp-it.
 

gea

How to run a Solaris based ZFS server
Set and forget is a bad idea

 

gea

Due to a user request I have implemented filesystem monitoring.

I have included fswatch for Illumos (OmniOS, OpenIndiana) and Solaris.
On Solaris I had to modify the sources, so I am not sure about stability, see https://github.com/emcrisostomo/fswatch/issues/228
 

gea

Manage ESXi via SOAP, e.g. create/delete ESXi snaps

soap.png


I came up with the idea of AiO (an ESXi server with virtualized ZFS/NFS storage, VMs on NFS and pass-through storage hardware) around 2010. This was the first stable ZFS storage solution based on (Open)Solaris or a lightweight, minimalistic OmniOS. Others copied the idea based on FreeBSD or Linux.

From the beginning, ZFS snaps offered a huge advantage over ESXi snaps as they can be created and destroyed without delay and without initial space consumption. Even thousands of snaps are possible, while ESXi snaps are limited to a few short-lived ones. Combined with ZFS replication, a high-speed backup or copy/move of VMs is ultra easy. That said, there is a problem with ZFS snaps and VMs, as the state of a VM in a ZFS snap is like after a sudden power loss. There is no guarantee that a VM in a ZFS snap is not corrupted.

In napp-it I included an ESXi hotsnap function that creates a safe ESXi snap prior to the ZFS snap and destroys the ESXi snap afterwards. This way every ZFS snap includes an ESXi snap with hot memory state. After restoring a VM from a ZFS snap you can go back to the safe ESXi snap. This works perfectly, but handling is a little complicated as you need ssh access to use esxcli. Maybe you have asked yourself whether there is an easier way, and there is one: via the ESXi SOAP API, similar to the ESXi web-ui.

Thomas just published a small interactive Perl script for easy ESXi web management via SOAP. It even works with ESXi free, see ESX / ESXi - Hilfethread


1. install (missing) Perl modules

perl -MCPAN -e shell
notest install Switch
notest install Net::SSLeay
notest install LWP
notest install LWP::Protocol::https
notest install Data::Dumper
notest install YAML
exit;

complete list of needed modules:
Switch
LWP::UserAgent
HTTP::Request
HTTP::Cookies
Data::Dumper
Term::ANSIColor
YAML
LIBSSL
Net::SSLeay
IO::Socket::SSL
IO::Socket::SSL::Utils
LWP::Protocol::https


Howto:
Update napp-it to newest 23.dev where the script is included

example: list all datastores
perl /var/web-gui/data/napp-it/zfsos/_lib/scripts/soap/VMWare_SOAP.pl list_all_datastores --host 192.168.2.48 --user root --password 1234

Attached Datastores "63757dea-d2c65df0-3249-0025905dea0a"
Attached Datastores "192.168.2.203:/nvme/nfs"


example: list VMs:
perl /var/web-gui/data/napp-it/zfsos/_lib/scripts/soap/VMWare_SOAP.pl list_attached_vms --host 192.168.2.48 --user root --password 1234 --mountpoint /nvme/nfs --mounthost 192.168.2.203

Attached VM ID "10" = "solaris11.4cbe"
Attached VM ID "11" = "w2019.125"
Attached VM ID "12" = "oi10.2022"
Attached VM ID "14" = "w11"
Attached VM ID "15" = "ventura"
Attached VM ID "16" = "danube"
Attached VM ID "9" = "omnios.dev.117"

example: create snap
perl /var/web-gui/data/napp-it/zfsos/_lib/scripts/soap/VMWare_SOAP.pl create_snapshot --host 192.168.2.48 --user root --password 1234 --mountpoint /nvme/nfs --mounthost 192.168.2.203 --vm_id 9 --snapname latest --mem --no-quiesce --snapdesc latest

example: list (latest) snap
perl /var/web-gui/data/napp-it/zfsos/_lib/scripts/soap/VMWare_SOAP.pl list_snapshot --host 192.168.2.48 --user root --password 1234 --vm_id 9

I will use the script together with a normal autosnap job in a future napp-it. Until then you can create a jobid.pre (e.g. 123456.pre) in /var/web-gui/_log/jobs/ with a script that creates the ESXi snap, and a jobid.post that destroys the ESXi snap after it has been included in the ZFS snap.
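
As a starting point for the jobid.pre file, here is a small sketch that assembles the create_snapshot call from the example above for a given VM. It only prints the command line. All flags are the ones shown in the examples; that the .pre file may simply contain such a shell command line is my assumption here. The matching call for jobid.post is not shown, as it depends on the delete action of the script.

```shell
# Prints the create_snapshot call for one VM; all flags are taken from the examples above.
SOAP=/var/web-gui/data/napp-it/zfsos/_lib/scripts/soap/VMWare_SOAP.pl

esxi_presnap_cmd() {
    host="$1"; pw="$2"; mounthost="$3"; mountpoint="$4"; vm_id="$5"
    echo "perl $SOAP create_snapshot --host $host --user root --password $pw" \
         "--mountpoint $mountpoint --mounthost $mounthost --vm_id $vm_id" \
         "--snapname latest --mem --no-quiesce --snapdesc latest"
}

# values from the examples above
esxi_presnap_cmd 192.168.2.48 1234 192.168.2.203 /nvme/nfs 9
```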

update
I have added a SOAP menu in latest napp-it 23.dev

soap.png
 

gea

How to duplicate a ZFS pool with ongoing replications?
The new pool should be exactly identical to the old pool but, for example, larger or with a different vdev structure.


You must take care of the following:
1. Transfer ZFS filesystems with a "recursive" job setting
This includes all datasets in the transfer (sub-filesystems, snaps, zvols)

2. A ZFS replication creates the new filesystem(s) below the destination ZFS filesystem
pool1 -> pool2 results in pool2/pool1

If you want an identical structure you must create (recursive) jobs for each first-level filesystem, e.g.
pool1/fs1 -> pool2 gives you pool2/fs1

If the new pool should be named like the old pool (e.g. pool1):
- destroy the old pool pool1 after the transfers are done, then export pool2 and import it as pool1

3. A replication does not transfer all ZFS properties, e.g. compress or sync
Filesystem attributes like ACLs are preserved.

If you want the same ZFS properties you must apply them after the replication.
A better way is to set them on the parent target filesystem (e.g. pool2) prior to the replications.
They are then inherited by the new filesystems.

4. Some ZFS properties can only be set at creation time,
e.g. upper/lowercase behaviour or character sets

If you use napp-it to create pools, these settings are identical.

5. Ongoing replication/backup jobs
You can continue old replication jobs if
- the pool structure remains identical
- you have snap pairs of former replications on both sides, e.g. jobid_repli_source/target_nr_1037

If you want to recreate a replication job that continues incremental transfers
- recreate the job with the same source/destination settings and the old jobid
The jobid is part of the old snap names

Or rerun an initial replication
Rename the old destination filesystems, e.g. to filesystem.bak, to preserve them in case of problems. Then rerun a replication (full transfer). After success, destroy the .bak filesystem. The next replications are then incremental again.
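
A small sketch for point 2, showing the naming rule for the destination and the zpool commands for taking over the old pool name. The commands are printed only, nothing is executed. pool1, pool2 and fs1 are the example names from above; the function names are made up for this illustration.

```shell
# Destination name of a replication: the last path element of the source
# is created below the target filesystem.
repl_target() {
    src="$1"; dst_parent="$2"
    echo "${dst_parent}/${src##*/}"
}

# Commands to let the new pool take over the old name (printed only).
rename_pool_cmds() {
    old="$1"; new="$2"
    echo "zpool destroy $old    # only after the copy has been verified"
    echo "zpool export $new"
    echo "zpool import $new $old"
}

repl_target pool1 pool2        # prints pool2/pool1
repl_target pool1/fs1 pool2    # prints pool2/fs1
rename_pool_cmds pool1 pool2
```

"zpool import pool2 pool1" imports the exported pool under a new name, so no data is copied for the rename.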
 

gea

New feature in napp-it 23.dev (Apr 05):
ZFS autosnaps and ZFS replications of ESXi/NFS filesystems with embedded ESXi hot memory snaps.

If you want to back up running VMs on ESXi, you mostly use commercial tools like VEEAM that support quiescing (briefly stopping filesystem activity during backup) or can include the ESXi hot memory state.

If you use ZFS to store VMs you can use ZFS snaps for versioning or to save and restore them via a simple SMB/NFS copy, Windows previous versions or ZFS replication. This works well, but only for VMs that are powered off during backup, as a ZFS snap is like a sudden power off. There is no guarantee that a running VM is not corrupted in a ZFS snap. While ESXi can provide safe snaps with quiescing or hot memory state, you cannot use them alone for a restore as they rely on the VM itself. A corrupt VM cannot be restored from ESXi snaps, while you can restore a VM from ZFS snaps. As ESXi snaps are delta files that grow over time, you should under no circumstances keep more than a few ESXi snaps, and none for longer than a few days.

So why not combine both: unlimited ZFS snaps with the recovery options of ESXi snaps? This can be achieved if you create an ESXi snap prior to the ZFS snap, which then includes the ESXi snap. After the ZFS snap is done, the ESXi snap can be destroyed.

Napp-it 23.dev automates this


Howto setup:

- update napp-it to current 23.dev
- add the needed Perl modules to OmniOS,
see https://forums.servethehome.com/ind...laris-news-tips-and-tricks.38240/#post-367124
- enter the ESXi settings (ip, root, pw and NFS datastores) in napp-it menu System > ESXi > NFS datastore

- list autosnap or replication jobs in napp-it menu Jobs
Click on the jobid to enter settings, add the ip of the ESXi server
- run the autosnap or replication job
Each ZFS snap will then include an ESXi snap. As a VM is stopped for a few seconds, run this at low-usage times.
- click on replicate or snap in the line of the job to check log entries

Restore a VM in a running state:
- shut down all VMs
- restore a single VM folder from a ZFS snap, either via SMB/NFS copy, Windows previous versions,
filesystem rollback or replication

ESXi will see the ESXi snaps after a vim-cmd vmsvc/reload vmid (Putty) or a reboot
- power on the VM and restore the last ESXi snap. The VM is then in the state of backup time, powered on.


more,
 

gea

The new minimalistic Solaris fork OmniOS long term stable r151046 (Unix) is out

Open-ZFS in its genuine Solaris environment
Perfect ZFS/OS integration with boot environments for trouble-free up/downgrades and lowest resource needs for ZFS.

Open source, but with a commercial support contract option.
Regular, often biweekly, security and bugfix updates are free.

A dedicated software repository per stable release
No sudden new features or unexpected behaviours, only security and bug fixes
Updating is possible from the last LTS r151038 onwards: switch the repository, run pkg update and reboot.

File services like iSCSI/FC, kernel-based NFS and the multithreaded SMB server are part of the Solaris OS,
with unique integration of Windows NTFS-like ACLs and direct support of Windows SIDs for AD users to preserve permissions in backups, local Windows-like SMB groups and zero-config ZFS snaps as Windows previous versions. Easy SMB config (no smb.conf), just turn it on/off.
 

gea

Critical Windows SMB security warning

In response to CVE-2022-38023, Microsoft is removing support for RPC Signing in the Netlogon server, instead requiring Sealing when establishing a 'secure channel'. More details can be found here: https://support.microsoft.com/en-us...22-38023-46ea3067-3989-4d40-963c-680fd9e8ee25 and here: https://msrc.microsoft.com/update-guide/vulnerability/CVE-2022-38023

Timeline
June 13: signing remains possible, but sealing can no longer be disabled on Windows Server
July 11: sealing is enforced, no authentication without sealing

Action
Update every AD member, whether a Windows machine or a device like OmniOS or SAMBA, prior to July 11!
For an Illumos/OmniOS kernel-based SMB server as an AD member, the sealing feature is in final approval.


The newest SAMBA supports sealing.
 

gea

Security update OmniOS r151046e (2023-05-31)

Weekly release for w/c 29th of May 2023.
This is a non-reboot update

Security Fixes
Curl has been updated to version 8.1.2, fixing CVE-2023-28319, CVE-2023-28320, CVE-2023-28321, CVE-2023-28322.
OpenSSL has been updated to versions 1.1.1u and 3.0.9, fixing CVE-2023-2650. OpenSSL 1.0.2 has also been patched against this.
 

gea

Security and feature update OmniOS r151046l LTS (2023-07-20)

Weekly release for w/c 17th of July 2023.
This update requires a reboot

Security Fixes
OpenSSH updated to version 9.3p2, fixing CVE-2023-38408.
The prgetsecflags() interface leaked a small (4 byte) portion of kernel stack memory - illumos 15788.
OpenJDK packages have been updated to 11.0.20+8 and 17.0.8+7.

Other Changes
Various improvements to the SMB idmap service have been backported:
illumos 14306
illumos 15556
illumos 15564
Most notably, it was previously possible to get flurries of log messages of the form
Can't get SID for ID=0 type=0 and this is now resolved.

The UUID generation library could produce invalid V4 UUIDs.
An issue with python header files that could cause some third party software to fail compilation has been resolved.
 

gea

Critical security update OmniOS r151038dm (2023-07-25)

To update, run pkg update
To undo the update, boot into the former boot environment


Weekly release for w/c 24th of July 2023.
This update requires a reboot

Changes
AMD CPU microcode updated to 20230719, mitigating CVE-2023-20593 on some Zen2 processors.
Intel CPU microcode updated to 20230512, refer to Intel's release notes for details.


Actions you need to take:

If you are not running the affected AMD parts, then there is nothing you need to do.
If you are running the affected AMD parts then you will need to update the AMD microcode.

Note, only Zen 2 based products are impacted. These include AMD products
known as:

* AMD EPYC 7XX2 Rome (Family 17h, model 31h)
* AMD Threadripper 3000 series Castle Peak (Family 17h, model 31h)
* AMD Ryzen 3000 Series Matisse
* AMD Ryzen 4000 Series Renoir (family 17h, model 60h)
* AMD Ryzen 5000 Series Lucienne (family 17h, model 68h)
* AMD Ryzen 7020 Series Mendocino (Family 17h, model a0h)

We have pushed an initial commit which provides a microcode fix for this
issue for the following processor families:

* Family 17h, model 31h (Rome / Castle Peak)
* Family 17h, model a0h (Mendocino)
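
To read the two lists above together, here is a tiny lookup sketch. It takes the CPU family and model as hex strings (without the trailing h) and tells you which case applies. It only encodes the models named above. Matisse is listed without a model number, so it ends up in the last branch. The function name is made up for this illustration.

```shell
# family/model as hex strings without prefix or suffix, e.g. "17" "31"
zen2_status() {
    family="$1"; model="$2"
    if [ "$family" != 17 ]; then
        echo "not Zen 2: no action needed"
        return 0
    fi
    case "$model" in
        31|a0) echo "affected: microcode fix in the initial commit" ;;
        60|68) echo "affected: not covered by the initial commit" ;;
        *)     echo "model not in the list above: check AMD's bulletin" ;;
    esac
}

zen2_status 17 31    # Rome / Castle Peak
zen2_status 17 60    # Renoir
```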

 

gea

Multithreaded Solaris/Illumos SMB server with NFS4 ACL and Windows SID

Solaris Unix is the origin of ZFS and still offers the best ZFS integration into the OS with the lowest resource needs for ZFS. While on Linux ZFS is just another filesystem among many others, Sun developed OpenSolaris together with and on top of ZFS as the primary filesystem (Oracle Solaris 11.4 and Illumos/OI/OmniOS are descendants), with many advanced features like DTrace, Service Management or container-based VMs. All of these features and ideas found their way into BSD or Linux, but without the deep integration of NFS or SMB into ZFS. The multithreaded SMB server that is part of Solaris-based systems is the most common reason to use this Unix.

The most important use case for storage is SMB, the file sharing protocol from Microsoft, which introduced superior fine-grained ACL permissions with inheritance in its NTFS filesystem and SMB shares. While traditional Linux/Unix permissions or POSIX ACLs only offer simple read/write/execute based on a user id like 101, NTFS ACLs add permissions to create/extend files or folders, modify or read attributes or take ownership, based on a unique id like S-1-5-21-3623811015-3361044348-30300820-1013

ACL permissions
The kernel-based Solaris/Illumos SMB server is the only one that fully integrates NFS4 ACLs (a superset of Windows NTFS ACLs, POSIX ACLs and simple Unix permissions) with the Windows SID (owner/user reference) as a ZFS attribute, even though ZFS as a Unix filesystem normally only accepts a Unix uid/gid as user/owner reference. The main advantage is that you can move or restore a ZFS filesystem with all Windows permissions intact. When you use SAMBA instead, which relies only on Unix uid/gid, you must use complicated id mappings to assign a Unix uid to a Windows SID, and these mappings differ from server to server.


SMB groups
The kernel-based Solaris/Illumos SMB server is the only one that additionally offers local SMB groups. Unlike Unix groups, Windows-like SMB groups allow groups in groups in ACL settings. The group id is a Windows SID, just like a user id.
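
Creating such a local SMB group from the command line can be sketched like this (Illumos smbadm syntax; Oracle Solaris names the subcommands slightly differently). The group name staff and the members paul and mary are made up. By default the commands are only printed.

```shell
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

make_smb_group() {
    group="$1"; shift
    run smbadm create "$group"
    for member in "$@"; do
        run smbadm add-member -m "$member" "$group"
    done
}

make_smb_group staff paul mary
```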

Windows previous versions
The kernel-based Solaris/Illumos SMB server is the only one with a strict relation between a ZFS filesystem and a share. This is important when you want to use ZFS snaps as Windows "Previous Versions". As ZFS snaps belong to a filesystem, it can be quite confusing when you use SAMBA instead. As SAMBA only sees data folders and knows nothing about ZFS, you must carefully configure and organize your shares to get this working, especially with nested ZFS filesystems, while on Solaris "Previous Versions" just works without any settings.

Setup
In general the Solaris/Illumos SMB server is much simpler to configure and set up than SAMBA, which is an option on Solaris too. There is no smb.conf with server settings, you just set the sharesmb property of a ZFS filesystem to on. SMB server behaviours can be set or shown with the admin tool smbadm, https://docs.oracle.com/cd/E86824_01/html/E54764/smbadm-1m.html, or are ZFS properties like aclmode and aclinherit, or are NFS4 file or share ACLs.
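
A minimal sketch of what "just turn it on" looks like on Illumos/OmniOS (the property is spelled sharesmb there; Oracle Solaris 11 uses a different spelling). tank/data and the share name data are made up, and passthrough for aclmode/aclinherit is only one common choice, not a required setting. By default the commands are only printed.

```shell
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

share_fs() {
    fs="$1"; name="$2"
    run zfs set aclmode=passthrough "$fs"
    run zfs set aclinherit=passthrough "$fs"
    run zfs set sharesmb=name="$name" "$fs"    # sharesmb=on uses a default share name
    run zfs get sharesmb,aclmode,aclinherit "$fs"
}

share_fs tank/data data
```

As the share is a property of the filesystem, it travels with the filesystem on a replication or pool move.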

Multithreaded
While the singlethreaded SAMBA wants the best single-core CPU performance, the kernel-based SMB server is better optimized for multicore CPUs and many parallel requests.

Cons
The kernel-based SMB server has fewer options than SAMBA and supports Windows AD member mode only (out of the box).
 

gea

On the 6th of November 2023, the OmniOSce Association released a new stable version
of OmniOS – The Open Source Enterprise Server OS. The release comes with many tool updates,
brand-new features and additional hardware support.


Unlike Oracle Solaris with native ZFS, OmniOS stable is based on Open-ZFS with a dedicated software repository per release.
This means that a simple 'pkg update' gives the newest state of the installed OmniOS release.

To update to a newer release, you must switch the publisher setting to the newer release.
A 'pkg update' then initiates a release update.

An update to 151048 stable is possible from 151046 LTS.
To update from an earlier release, you must update in steps via the LTS versions.

OmniOS 151044 stable is EoL with no further updates.
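
The publisher switch plus release update can be sketched as follows. The function only prints the commands for a given release so you can check them first. The repository URLs follow the usual pkg.omnios.org layout; please compare with the official upgrade notes of the release before running anything. The boot environment name is simply set to the release name here.

```shell
# Prints the commands for a release update to the given release, e.g. r151048.
release_update_cmds() {
    rel="$1"
    echo "pkg set-publisher -r -O https://pkg.omnios.org/${rel}/core omnios"
    echo "pkg set-publisher -r -O https://pkg.omnios.org/${rel}/extra extra.omnios"
    echo "pkg update -f -r --be-name=${rel}"
    echo "init 6"
}

release_update_cmds r151048
```

The update is installed into a new boot environment, so the former release stays bootable.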
 

gea

Real filebased data tiering on ZFS
I'm currently working on the topic of data tiering.

With the special vdev, ZFS offers a very intelligent approach for hybrid pools made up of large but slow disks and expensive and fast SSD/NVMe. The basic idea with ZFS is: Particularly performance-critical data is stored on the fast special vdev due to its physical data structure (small io, metadata, Dedup tables), all other data is stored on the slow pool vdev.

The main advantage is that you don't have to set anything or copy data between the fast and slow vdevs. Just set and forget. This is a perfect approach for use cases with a lot of small, volatile data from many users (e.g. university mail server) and provides significant performance.

But for a normal Office or VM server this is practically quite useless. The classic tiering approach of storing data specifically in the fast or slower part of the pool would be more suitable. There is no support for this in Open-ZFS, but it could certainly be achieved, see https://illumos.topicbox.com/groups...on-a-special-vdev-and-rule-based-data-tiering

I'm planning a Pool > Tiering menu in napp-it to make this more convenient. Until then, everyone can try it out manually.
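
One way to try this manually is the special_small_blocks property per filesystem: data blocks up to that size are written to the special vdev, so with a threshold equal to recordsize all data of a filesystem lands on the fast vdev. A sketch under these assumptions (pool tank with a special vdev already added, filesystems vm and archive are made-up names), printing the commands only:

```shell
# Prints the property settings for one "fast" and one "slow" filesystem.
tiering_cmds() {
    pool="$1"; fast_fs="$2"; slow_fs="$3"
    # threshold = recordsize: every data block of this filesystem goes to the special vdev
    echo "zfs set recordsize=128K $pool/$fast_fs"
    echo "zfs set special_small_blocks=128K $pool/$fast_fs"
    # threshold 0: only metadata of this filesystem uses the special vdev
    echo "zfs set special_small_blocks=0 $pool/$slow_fs"
}

tiering_cmds tank vm archive
```

The setting only affects newly written blocks. Existing data changes tier only when it is rewritten, e.g. by a copy or a replication.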

sbb.PNG
 

gea

OmniOS r151048b (2023-11-15)

Weekly release for w/c 13th of November 2023.
This update requires a reboot
Security Fixes
  • Intel CPU microcode updated to 20231114, including a security update for INTEL-SA-00950.
  • AMD CPU microcode updated to 20231019.
Other Changes
  • The UUID of a bhyve VM was changing on every zone restart. For VMs using cloud-init, this caused them to be considered a new host on each cold boot.
To update from the initial 151048, run 'pkg update' + reboot
To update from a former OmniOS, update in steps via the LTS versions
To downgrade, start a former boot environment (automatically created on updates)
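
Instead of selecting the former boot environment in the boot menu, you can also activate it with beadm. A small sketch; the BE name omnios-r151048 is made up, use a name from the beadm list output. By default the commands are only printed.

```shell
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

rollback_to_be() {
    be="$1"
    run beadm list              # shows all boot environments
    run beadm activate "$be"    # make it the default for the next boot
    run init 6                  # reboot
}

rollback_to_be omnios-r151048
```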