
pxelinux + nfs root partition = no booting

Crusty

I'm trying to set up a diskless Ubuntu install for an HTPC.

I've got dhcpd and tftpd-hpa set up correctly on my Debian server, and the client boots the kernel just fine.

The client has 2 NICs in it: a 1 Gbps card and a 100 Mbps card. The gigabit card is the one that is connected to the server through a switch. It is on a completely different physical network than the 100 Mbps one. When the client boots, it gets its IP from DHCP, 10.0.0.2. The NFS server is on the Debian box, 10.0.0.3.

The driver for the NIC loads just fine, and the card comes up as eth0. From the boot messages I can see that it has the correct IP... however, when it goes to mount /, I get

VFS: Mounted root (nfs filesystem) readonly.
Freeing unused kernel memory: 256k freed
nfs: server 10.0.0.3 not responding, still trying
nfs: server 10.0.0.3 not responding, still trying
nfs: server 10.0.0.3 not responding, still trying

It hangs at this point. I can ping it on the network, though.

I know NFS is working because I've got other exports from the same server mounted on a 3rd computer.

Syslog on the server shows an authenticated request for the mount as well.

I have tried specifying the IP settings in the APPEND string within the pxelinux config files, as well as just using DHCP.

Any ideas?
 
Any more errors than that? It's not a lot to go by... I've never done PXE netboots, just Sun stuff and a better OS.

It says it mounted /, maybe there is a misconfiguration for one of the other file systems or something.

EDIT: Can you boot into single user mode?
 
Originally posted by: n0cmonkey
Any more errors than that? It's not a lot to go by... I've never done PXE netboots, just Sun stuff and a better OS.

It says it mounted /, maybe there is a misconfiguration for one of the other file systems or something.

EDIT: Can you boot into single user mode?

I'll try. There is no other filesystem on the system. I'm still unsure of what to do for a swap partition. The system only has 512 MB of RAM in it.

edit: As far as error messages go, there are none on the server end of things, and those are the only ones I can see on the client end. I'm looking for a serial cable to set up a serial console on this thing.
 
Originally posted by: Crusty
I'll try. There is no other filesystem on the system. I'm still unsure of what to do for a swap partition. The system only has 512 MB of RAM in it.

I think I used a swap file on my systems, but it's been a while since I did it. My systems had a LOT less than 512MB. 😛

Maybe there is an NFS debug option for the NFS server in Linux.
 
My guess is that it's trying the wrong ethernet card.

Depending on how the kernel is configured, how the modules are loaded, and the phase of the moon, what is eth0 on one boot under one system may or may not be eth0 when booted the other way.

I'd personally just remove the second NIC for now until you get it working.

noc:
PXE boot actually is very kick-ass. The only bad part is that tftp services themselves can be very flaky. I've had identical hardware that would work on 2 or 3 machines, but the fourth required having the tftpd service restarted before it'd grab the kernel and initrd.
 
Originally posted by: drag
My guess is that it's trying the wrong ethernet card.

Depending on how the kernel is configured, how the modules are loaded, and the phase of the moon, what is eth0 on one boot under one system may or may not be eth0 when booted the other way.

I'd personally just remove the second NIC for now until you get it working.

noc:
PXE boot actually is very kick-ass. The only bad part is that tftp services themselves can be very flaky. I've had identical hardware that would work on 2 or 3 machines, but the fourth required having the tftpd service restarted before it'd grab the kernel and initrd.

I've been meaning to try it, but I was spoiled with Sun's netbooting. :heart:
 
As far as I can tell it's not mounting /

server:/ubuntu/var/log# ls
auth.log base-config.timings btmp faillog lastlog syslog
base-config.log bootstrap.log dpkg.log ksymoops news wtmp


Where /ubuntu is the / of the client. syslog is completely empty, and there isn't even a dmesg.

I'll try taking out the other NIC and see what happens.
 
Same thing. eth0 is the gigabit card now, same as it was before. The server still shows the authenticated mount request for /ubuntu.

I'm gonna turn up the log level of nfs and see what it says.
 
This is what I get when I restart the nfs-kernel-server service...

Jan 7 09:16:05 server mountd[26279]: Caught signal 15, un-registering and exiting.
Jan 7 09:16:05 server kernel: nfsd: last server has exited
Jan 7 09:16:05 server kernel: nfsd: unexporting all filesystems
Jan 7 09:16:05 server kernel: RPC: failed to contact portmap (errno -5).
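The "failed to contact portmap" line suggests checking what the portmapper has registered, since an NFS root mount looks up RPC programs 100003 (nfs) and 100005 (mountd) through it. A rough sketch, assuming the standard rpcinfo tool is installed; the helper name rpc_registered is made up for illustration:

```shell
# Reads `rpcinfo -p` output on stdin and succeeds if the given
# RPC program number appears in the first column.
# rpc_registered is a hypothetical helper, not part of any package.
rpc_registered() {
    awk -v prog="$1" '$1 == prog { found = 1 } END { exit !found }'
}

# Usage (needs a reachable portmapper on 10.0.0.3):
#   rpcinfo -p 10.0.0.3 | rpc_registered 100003 && echo "nfs is registered"
#   rpcinfo -p 10.0.0.3 | rpc_registered 100005 && echo "mountd is registered"
```

If either program is missing, restarting the portmapper first and the NFS server second is the usual thing to try, though that is a guess about this particular setup.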
 
root=/dev/nfs nfsroot=10.0.0.3:/ubuntu ip=dhcp

Those are my kernel parameters, and nfs_root is compiled into the kernel.
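For anyone following along, here is a small sketch of how those parameters end up as a pxelinux APPEND line, with either ip=dhcp or a static ip= value. The field order for the static form (client:server:gateway:netmask:hostname:device:autoconf) follows the kernel's nfsroot documentation; the function names and the kernel filename in the comments are assumptions for illustration only.

```shell
# Print a pxelinux APPEND line for an NFS root.
# Usage: nfsroot_append SERVER EXPORT [IPSPEC]   (IPSPEC defaults to dhcp)
nfsroot_append() {
    printf 'APPEND root=/dev/nfs nfsroot=%s:%s ip=%s\n' "$1" "$2" "${3:-dhcp}"
}

# Print a static ip= value with kernel autoconfiguration turned off.
# Usage: static_ip CLIENT SERVER GATEWAY NETMASK HOSTNAME DEVICE
# An empty argument leaves that field blank, which the kernel allows.
static_ip() {
    printf '%s:%s:%s:%s:%s:%s:off\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example entry in pxelinux.cfg/default (vmlinuz is a hypothetical filename
# relative to the tftp root):
#   DEFAULT htpc
#   LABEL htpc
#     KERNEL vmlinuz
#     APPEND root=/dev/nfs nfsroot=10.0.0.3:/ubuntu ip=dhcp
```

Calling nfsroot_append 10.0.0.3 /ubuntu "$(static_ip 10.0.0.2 10.0.0.3 "" 255.255.255.0 htpc eth0)" gives the static variant, which takes DHCP out of the picture while debugging.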
 
I don't understand how all of a sudden it would make a difference. I've been running an HD install of Debian on the client for a year or so, using 3 different NFS shares. I don't think there is an issue with the server end of things; I can still mount all of the shares, including the /ubuntu one, across the network. I'll give it a try though, can't be too sure.

edit: Doesn't make a difference, still get the same error message.

One thing I noticed is that in the client boot it says

device=eth0, addr=10.0.0.2, mask=255.255.255.0,gw=10.0.0.2,
host=htpc, domain= , nis-domain=(none),
bootserver=255.255.255.255, rootserver=10.0.0.3, rootpath=
Looking up port of RPC 100003/2 on 10.0.0.3
e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duple
Looking up port of RPC 100005/1 on 10.0.0.3
VFS: Mounted root (nfs filesystem) readonly.
Freeing unused kernel memory: 256k freed
nfs: server 10.0.0.3 not responding, still trying
nfs: server 10.0.0.3 not responding, still trying
nfs: server 10.0.0.3 not responding, still trying


The dhcpd.conf file is attached; don't mind the domain name, heh.
 
I changed my dhcpd.conf file to what is attached (all I did was move option root-path "/ubuntu";) and now I get this...



device=eth0, addr=10.0.0.2, mask=255.255.255.0,gw=10.0.0.2,
host=htpc, domain= , nis-domain=(none),
bootserver=10.0.0.3, rootserver=10.0.0.3, rootpath=/ubuntu
Looking up port of RPC 100003/2 on 10.0.0.3
e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duple
Looking up port of RPC 100005/1 on 10.0.0.3
VFS: Mounted root (nfs filesystem) readonly.
Freeing unused kernel memory: 256k freed
nfs: server 10.0.0.3 not responding, still trying
nfs: server 10.0.0.3 not responding, still trying
nfs: server 10.0.0.3 not responding, still trying


But still no go 🙁
 
My point in disabling iptables was just to get rid of possible factors. Make the problem as simple as possible. Hell, replacing cables with known good ones wouldn't be a stupid step either.

Does the entry filename "pxelinux.0"; need a path?
 
Originally posted by: n0cmonkey
My point in disabling iptables was just to get rid of possible factors. Make the problem as simple as possible. Hell, replacing cables with known good ones wouldn't be a stupid step either.

Does the entry filename "pxelinux.0"; need a path?

No it does not. It is relative to the tftp root directory. That part only has to do with starting the boot loader and loading the kernel, which seems to work fine.

I've tried different cables and even a crossover cable. The only thing I can think of might be the MTU size. That network is using an 8k frame size. I don't know how to specify the MTU in the kernel parameters or in the dhcpd.conf file, so I'll try moving the server to a 1500-byte frame size and see if that fixes things. Both NICs and the switch support jumbo frames, but we'll see.
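One way to test the MTU theory without guessing is a ping with fragmentation prohibited, sized to exactly fill the frame. This is only a sketch: it assumes the Linux iputils ping (for -M do) and the ip tool from iproute, the interface name eth0 on the server is an assumption, and ping_payload is a made-up helper.

```shell
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IP header and 8 bytes of ICMP header.
ping_payload() {
    echo $(( $1 - 28 ))
}

# Usage from a machine on the same network:
#   ping -M do -s "$(ping_payload 1500)" -c 3 10.0.0.3
# Repeat with the jumbo MTU actually configured on that network. If the
# standard size gets replies and the jumbo size does not, something in the
# path is dropping the large frames.
#
# Dropping the server back to standard frames for a test
# (eth0 is an assumed interface name):
#   ip link set dev eth0 mtu 1500
#
# Another knob sometimes tried is smaller NFS transfer sizes on the
# kernel command line:
#   nfsroot=10.0.0.3:/ubuntu,rsize=1024,wsize=1024
```

A mount that succeeds followed by "not responding" on the first real reads is at least consistent with small packets getting through and large ones not, but that is a hypothesis rather than a diagnosis.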
 
Interesting... showmount -e gives me...

server:/# showmount -e
showmount: can't get address for server


Even though I have another computer with 3 exports mounted... I'm going to reboot the server and see if that fixes things.
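For what it's worth, showmount -e with no host argument queries the local hostname, so "can't get address for server" usually means the name "server" does not resolve on the box itself, which is separate from whether the exports work. A quick sketch for checking the hosts file; hosts_has_name is a hypothetical helper, and the second argument exists only so a different file can be tested:

```shell
# Succeeds if NAME appears as a hostname or alias on a non-comment line.
# Usage: hosts_has_name NAME [HOSTS_FILE]   (defaults to /etc/hosts)
hosts_has_name() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }                      # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}

# Usage on the server:
#   hosts_has_name server || echo "no hosts entry for 'server'"
#   showmount -e 10.0.0.3     # sidesteps name resolution entirely
```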
 
Well, with a reboot I get "no init found"... I'm going to pass init=/sbin/init to the kernel now. Hopefully that fixes it.
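Before changing kernel parameters, it can be worth confirming from the server side that the exported tree actually contains an executable init. A minimal sketch, assuming the export lives at /ubuntu as above; check_nfsroot is a made-up name, and the check is run on the server, so absolute symlinks inside the export would resolve against the server's own filesystem rather than the client's.

```shell
# Report any of the usual init candidates that are missing or not
# executable under the given root directory. Returns non-zero if any
# are missing.
check_nfsroot() {
    root=$1
    missing=0
    for path in sbin/init bin/sh; do
        if [ ! -x "$root/$path" ]; then
            echo "missing or not executable: $root/$path"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the server:
#   check_nfsroot /ubuntu && echo "init candidates look fine"
```

If the files are all there, "no init found" can also come from the kernel being unable to execute them, for example a kernel built for a different architecture or without the right binary format support.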
 
After sorting out some kernel/module issues, everything is booting and working correctly now. As far as I can tell, the server just needed a good reboot after I tweaked the configs a bit.
 