18 May 2017

DEC 4000 AXP – Initialisation and OS Installation

I was finally able to see init screens and boot diagnostics after getting my newly acquired DEC 4000 AXP to talk over serial. PuTTY could be configured into DEC ANSI and displayed everything perfectly.

VMS PALcode V5.56A, OSF PALcode V1.45A
+------------------------------------------------------------------------------+
¦                      13:26:22  January 9, 2061                               ¦
¦                                                                              ¦
¦                       Digital Equipment Corporation                          ¦
¦                                                                              ¦
¦                             DEC 4000 AXP (tm)                                ¦
¦                                                                              ¦
¦                   +------------------------------------+                     ¦
¦                   ¦ | Executing Power Up Diagnostics   ¦                     ¦
¦                   +------------------------------------+                     ¦
¦                                                                              ¦
¦                                                                              ¦
¦                                                                              ¦
¦                CPU   Memory     Storage    Net    Futurebus+                 ¦
¦                0 1   0 1 2 3   A B C D E   0 1   1 2 3 4 5 6                 ¦
¦              +-----------------------------------------------+               ¦
¦              ¦ P -   - - P P   ? ? ? ? ?   ? ?   ? ? ? ? ? ? ¦               ¦
¦              +-----------------------------------------------+               ¦
¦                                                                              ¦
+------------------------------------------------------------------------------¦
¦ *  Test in progress   P  Pass   F  Fail   -  Not Present   ?  Sizing         ¦
+------------------------------------------------------------------------------+

Diagnostic Name        ID             Device  Pass  Test  Hard/Soft   9-JAN-2061
io_test          0000003e       scsi_low_con     1     1     0    1     13:26:29
Expected value:fc  Received value: fffffffd
Failing addr:    0

*** End of Error ***

*** Soft Error - Error #11 - TOY Clock Valid bit not set

Diagnostic Name        ID             Device  Pass  Test  Hard/Soft   9-JAN-2061
io_test          0000003e                toy     1     2     0    2     13:26:29
Expected value:80  Received value: 0
Failing addr:    d

*** End of Error ***

*** Error (eza0), Mop loop message timed out from: 08-00-2b-3d-df-18

*** List index: 0 received count: 0 expected count 1

+------------------------------------------------------------------------------+
¦   Console V4.0-1                    VMS PALcode V5.56A, OSF PALcode V1.45A   ¦
¦                                                                              ¦
¦   CPU 0         P    B2001-BA DECchip (tm) 21064-3                           ¦
¦   CPU 1         -                                                            ¦
¦   Memory 0      -                                                            ¦
¦   Memory 1      -                                                            ¦
¦   Memory 2      P    B2002-DA 128 MB                                         ¦
¦   Memory 3      P    B2002-DA 128 MB                                         ¦
¦   Ethernet 0    P    08-00-2B-3D-DF-18                                       ¦
¦   Ethernet 1    P    08-00-2B-3D-DF-17                                       ¦
¦                                                                              ¦
¦                      ID 0   ID 1   ID 2   ID 3   ID 4   ID 5   ID 6   ID 7   ¦
¦   A     SCSI    P    RZ28   RZ28   RZ28   RZ28                 MATSHI Host   ¦
¦   B             P                                                            ¦
¦   C             P                                                            ¦
¦   D             P                                                            ¦
¦   E             F                                                            ¦
¦   Futurebus+    P                                                            ¦
¦                                                                              ¦
¦   System Status Fail        Type 'cat el' to see errors                      ¦
+------------------------------------------------------------------------------+
DEC 4000 AXP (tm) console V4.0-1, built on Apr 13 1998 at 16:21:03
>>>

Initialisation indicated 3 errors: SCSI device, TOY and MOP. The SCSI error can probably be attributed to the fact that I removed the whole 'B' drive chassis. TOY is the Time Of Year clock and I'm going to assume that the battery is dead. The MOP error points to Ethernet port 0 (eza0), where nothing is plugged in.

Time Of Year

To fix this error, I needed to remove the KFA40 I/O Module. This is located in the right-most slot on the back of the machine. Extricating the board turned out to be a challenge in itself. It looks like someone had already snapped the brackets that hold the face-plate to the main board. This face-plate includes the levers which 'jimmy' the board out of the rear sockets, and so they no longer applied pressure where required. Instead they just further wrecked the face-plate!

To get the board out, I first tried a hand each on the network BNC sockets, but this was also a dangerous idea as they would only be soldered on. I ended up cutting a coat-hanger and slotting it in at the back of the board, between the two plugs where the board plugs into the back-plane. A slight amount of pressure saw the whole board pop out.


I had a quick glance over it and found nothing that looked like a cell battery. I then went back to the diagram of the board in the Technical Manual and freaked out.

kfa40-io

It's a bloody DS1287! This is the same as the Compaq Deskpro 386/20n. Total nightmare.


I tried to de-solder it, but I'm really not talented in that department. I therefore decided to perform open-chip-surgery on the unit whilst still on the board. If I failed, then I'd hack the thing off entirely and replace it... if I succeeded, then I'd have a coin-cell slot on top for anyone to replace!


Once this was in place... the machine started differently?

VMS PALcode V5.56A, OSF PALcode V1.45A

Lbus & Fbus have been reset and Lbus enabled
initializing timer data structures
lowering IPL
counted 15741191 cycles in 100 ticks
CPU 0 speed is 6.20 ns (161 MHz)
entering idle loop
Starting Memory Diagnostics
Leaving back-to-back transactions turned off
Testing CMIC on Memory Module 2
Turning off the stream buffers
Testing CMIC on Memory Module 3
Turning on the stream buffers
Testing 1st 2MB(s) on memory module 3
Testing all memory banks in parallel
Testing Memory bank 0
Testing Memory bank 1
Testing Memory bank 2
Testing Memory bank 3
Module   Size    Base Addr   Intlv Mode  Intlv Unit
------   -----   ---------   ----------  ----------
  0              Not Installed
  1              Not Installed
  2      128MB   00000000      1-Way         0
  3      128MB   08000000      1-Way         0
Configured memory size = 10000000
Memory Diagnostics completed
access NVRAM
test Script RAM
enable ncr4 ACK
test Storage Bus E
Initializing driver eza0.0.0.6.0.
Driver eza0.0.0.6.0 initialized.
Initializing driver ezb0.0.0.7.0.
enable ncr0 ACK
test Storage Bus A
Driver ezb0.0.0.7.0 initialized.
enable Fbus
Start of FBUS sizer
Fbus sizer completed
environment variable etherneta created
environment variable ethernetb created
enable ncr1 ACK
test Storage Bus B


*** Soft Error - Error #1 - Lower SCSI Continuity Card Missing (connector J7)


Diagnostic Name        ID             Device  Pass  Test  Hard/Soft  153- -2053
io_test          0000003d       scsi_low_con     1     1     0    1     25:153:4
Expected value:fc  Received value: fffffffd
Failing addr:    0

*** End of Error ***


environment variable aa_lp_cnt00000040 created
environment variable aa_value_bcc created
environment variable aa_2x_cache_size created
Warning: ncr1, loopback connector attached  OR
SCSI bus failure, could not acquire bus; Control Lines:ff Data lines:ff
Warning: ncr1 not tested
enable ncr2 ACK
test Storage Bus C
enable ncr3 ACK
test Storage Bus D
DEC 4000 AXP (tm) console V4.0-1, built on Apr 13 1998 at 16:21:03
>>>
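
As an aside, the "counted ... cycles in 100 ticks" line is enough to reproduce the reported clock speed, and the memory figures in the log are hexadecimal byte counts. Here is a rough Python sketch of that arithmetic. It assumes the console timer ticks 1024 times per second; the log never says so, but that value reproduces the printed 6.20 ns. The function names are my own.

```python
# Assumption: one console "tick" is 1/1024 s. The log does not state this;
# it is inferred because it reproduces the figures the console prints.
TICK_HZ = 1024


def cpu_clock(cycles, ticks=100, tick_hz=TICK_HZ):
    """Return (MHz, ns per cycle) from a 'counted N cycles in T ticks' line."""
    seconds = ticks / tick_hz
    hz = cycles / seconds
    return hz / 1e6, 1e9 / hz


def hex_size_mb(hex_text):
    """Convert a hex byte count such as '10000000' to whole megabytes."""
    return int(hex_text, 16) // (1024 * 1024)


mhz, ns = cpu_clock(15741191)
print(f"CPU 0: {ns:.2f} ns per cycle, about {mhz:.0f} MHz")
print("Configured memory:", hex_size_mb("10000000"), "MB")
print("Module 3 base address:", hex_size_mb("08000000"), "MB")
```

Under the same assumption, the 15626232 cycles counted in the later MOP boot work out to 6.25 ns, which also matches what the console printed.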

From here I started digging for firmware updates as it seemed that I'd taken the system backwards? I randomly landed on Firmware Update Release Notes for v4.0, but calling show config seems to indicate that I already have that version.

>>>show config


     Console V4.0-1                    VMS PALcode V5.56A, OSF PALcode V1.45A

     CPU 0         P    B2001-BA DECchip (tm) 21064-3
     CPU 1         -
     Memory 0      -
     Memory 1      -
     Memory 2      P    B2002-DA 128 MB
     Memory 3      P    B2002-DA 128 MB
     Ethernet 0    P    08-00-2B-3D-DF-18
     Ethernet 1    P    08-00-2B-3D-DF-17

                        ID 0   ID 1   ID 2   ID 3   ID 4   ID 5   ID 6   ID 7
     A     SCSI    P    RZ28   RZ28   RZ28   RZ28                 MATSHI Host
     B             P
     C             P
     D             P
     E             F
     Futurebus+    P

     System Status Fail        Type 'cat el' to see errors

Reading the documentation... v4.0 of the firmware only supports up to 4.0D of Tru64 UNIX (well, DIGITAL UNIX). That can be obtained from here, so let's try that.

...actually... let's get the hardware correct first.

SCSI Drive Bay B

This has always been in the init error logs. I previously removed the actual drive bay chassis to prevent the hard error... but the soft error remained. I initially tried to swap the working disk set into slot B, but it still threw the same error, meaning that it was the socket or the back-plane... or the connection through to the IO board. This, of course, meant that it was time to pull the thing to bits.


The modularity of the case made tearing it down easy enough to try and diagnose the problem. The machine was apart in no time! Unfortunately nothing obvious came up. This was all the nether-regions of the system that wouldn't have been touched for 25 years. I re-seated a few plugs, but that was about it. Meanwhile, putting it back together took a little longer. It seems that jiggling the cables did the trick though! Upon power up there was no more error B and the power light stayed on! I then reassembled the drives and slapped it all in.

>>>show config


     Console V4.0-1                    VMS PALcode V5.56A, OSF PALcode V1.45A

     CPU 0         P    B2001-BA DECchip (tm) 21064-3
     CPU 1         -
     Memory 0      P    B2002-DA 128 MB
     Memory 1      -
     Memory 2      -
     Memory 3      -
     Ethernet 0    P    08-00-2B-3D-DF-18
     Ethernet 1    P    08-00-2B-3D-DF-17

                        ID 0   ID 1   ID 2   ID 3   ID 4   ID 5   ID 6   ID 7
     A     SCSI    P    RZ28   RZ28   RZ28   RZ28                        Host
     B     SCSI    P    RZ28   RZ28   RZ28M  RZ28                        Host
     C             P
     D             P
     E     SCSI    P                                              MATSHI Host
     Futurebus+    P

     System Status Pass        Type b to boot dkb0.0.0.1.0

Wait, what's that last line? We now have a bootable partition?

>>>b
FMBPR and Fbus devices have been reset
(boot dkb0.0.0.1.0 -flags 0)
block 0 of dkb0.0.0.1.0 is a valid boot block
reading 16 blocks from dkb0.0.0.1.0
bootstrap code read in
base = 1f4000, image_start = 0, image_bytes = 2000
initializing HWRPB at 2000
initializing page table at 1e6000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code
can't open osf_boot

halted CPU 0

halt code = 5
HALT instruction executed
PC = 20000030
>>>

Sure, it's got the boot block... but it's missing files. Back to trying to boot off CD.
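
The device names the console prints (dkb0.0.0.1.0, dka600.6.0.0.0, eza0.0.0.6.0) follow a fixed pattern, so they are easy to pull apart. Below is a small sketch of a parser. The field names node, channel, slot and hose are my assumption about what the four dotted numbers mean; nothing in the output above labels them. What the transcripts do show is that the CD at SCSI ID 6 on bus A comes out as unit 600 with a first dotted field of 6.

```python
import re
from typing import NamedTuple


class ConsoleDevice(NamedTuple):
    driver: str    # "dk" for SCSI disks, "ez" for the Ethernet ports
    adapter: str   # controller letter: "a", "b", ...
    unit: int      # e.g. 600 for the CD drive at SCSI ID 6
    node: int      # field names from here down are assumed, not documented above
    channel: int
    slot: int
    hose: int


DEVICE_RE = re.compile(r"([a-z]{2})([a-z])(\d+)\.(\d+)\.(\d+)\.(\d+)\.(\d+)")


def parse_device(name: str) -> ConsoleDevice:
    """Split a console device name into its letter and number fields."""
    match = DEVICE_RE.fullmatch(name.lower())
    if match is None:
        raise ValueError(f"not a console device name: {name!r}")
    driver, adapter, *numbers = match.groups()
    return ConsoleDevice(driver, adapter, *map(int, numbers))


for name in ("dkb0.0.0.1.0", "dka600.6.0.0.0", "eza0.0.0.6.0"):
    print(name, "->", parse_device(name))
```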

Digital UNIX 4.0D

>>>b dka6
(boot dka600.6.0.0.0 -flags 0)
block 0 of dka600.6.0.0.0 is a valid boot block
reading 16 blocks from dka600.6.0.0.0
bootstrap code read in
base = 1f4000, image_start = 0, image_bytes = 2000
initializing HWRPB at 2000
initializing page table at 1e6000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code

Digital UNIX boot - Mon Dec 29 18:50:44 EST 1997

Loading vmunix ...
Loading at fffffc0000230000
Current PAL Revision <0x2000000010538>
Switching to OSF PALcode Succeeded
New PAL Revision <0x200000002012d>

Sizes:
text = 4961776
data = 1324288
bss  = 2884976
Starting at 0xfffffc000042d420

Alpha boot: available memory from 0xf08000 to 0xfffe000
Digital UNIX V4.0D  (Rev
halted CPU 0

halt code = 1
operator initiated halt
PC = fffffc00004351a4
>>>

Tru64 UNIX 5.0

>>>b dka6
(boot dka600.6.0.0.0 -flags A)
block 0 of dka600.6.0.0.0 is a valid boot block
reading 16 blocks from dka600.6.0.0.0
bootstrap code read in
base = 1f4000, image_start = 0, image_bytes = 2000
initializing HWRPB at 2000
initializing page table at 1e6000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code

UNIX boot - Tue Jul 20 21:02:13 EDT 1999

Loading vmunix ...
Loading at 0xffffffff00000000
Mapping Image Address Space: Complete

Sizes:
text = 6284160
data =  1728208
bss  =  1934480
Starting at 0xffffffff00242230

Alpha boot: available memory from 0x121a000 to 0xfffe000
Digital UNIX V5.0 (Rev. 910); Tue Jul 20 22:13:21 EDT 1999
physical memory = 256.00 megabytes.
available memory = 239.99 megabytes.
using 325 buffers containing 2.53 megabytes of memory
emx: dynamic addressing enabled
Firmware revision: 4.0
PALcode: UNIX version 1.45
DEC 4000 Mod
halted CPU 0

halt code = 1
operator initiated halt
PC = ffffffff0024cfa4

Tek Tips has a forum post here on trying to get Tru64 booted. They indicate that you might need to set some parameters first for Unix to boot.

set os_type unix
set auto_action halt
set bootdef_dev ""
set boot_osflags 0
set eia0_mode fastfd
init

Tru64 UNIX 5.1B

>>>b dka6
(boot dka600.6.0.0.0 -flags A)
block 0 of dka600.6.0.0.0 is a valid boot block
reading 15 blocks from dka600.6.0.0.0
bootstrap code read in
base = 1f4000, image_start = 0, image_bytes = 1e00
initializing HWRPB at 2000
initializing page table at 1e6000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code

UNIX boot - Wednesday October 16, 2002

Loading vmunix ...
Loading at 0xffffffff00000000

Sizes:
text =  7922752
data =  2044560
bss  =  2433008
Starting at 0xffffffff00011d70

bcm: DEGXA driver V1.0.6 NUMA lanlog
failed configuring ev7_ocla subsystem
Alpha boot: available memory from 0x1670000 to 0xfffe000
Compaq Tru64 UNIX V5.1B (Rev. 2650); Wed Oct 16 17:45:54 EDT 2002
physical memory = 256.00 megabytes.
available memory = 233.55 megabytes.
using 307 buffers containing 2.39 megabytes of memory
panic (cpu 0): platform not supported by this kernel configuration

DUMP: Warning: no disk available for dump.

DUMP: first crash dump failed: attempting memory dump...
DUMP: compressing 19536KB into 215807KB memory...
DUMP:  Starting Address      E
halted CPU 0

halt code = 1
operator initiated halt
PC = ffffffff800e6be0
>>>

Right, the kernel on the CD is not built for my hardware!
Does this need NHD-7? It seems there are 'New Hardware Delivery' disks that provide additional hardware support... although this doesn't make sense, as 5.1B was released AFTER this hardware came into existence.

Wait... WinWorld has the NHD CDs and firmware updates! I'll try these at some point.

OpenVMS 8.4

>>>b dka6
(boot dka600.6.0.0.0 -flags A)
block 0 of dka600.6.0.0.0 is a valid boot block
reading 1230 blocks from dka600.6.0.0.0
bootstrap code read in
base = 1f4000, image_start = 0, image_bytes = 99c00
initializing HWRPB at 2000
initializing page table at 1e6000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code


    Op
halted CPU 0

halt code = 1
operator initiated halt
PC = ffffffff83188df4
>>>

Arrghhhhh... There's an underlying trend here! It seems that every OS, whilst trying to boot, is failing even trying to print out its own name! I bet that Op above is meant to be the start of 'OpenVMS' and, like all the other attempts above, it's throwing a CPU fault whilst initialising the CPU?
I wonder if the CPU is actually faulty? Or the Memory boards?
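
Lining up the halt stanzas from the attempts above makes the pattern easier to see. This is only a quick sketch: the stanzas are pasted in by hand from the transcripts, and the labels are mine.

```python
import re
from collections import Counter

# Halt stanzas copied from the boot attempts above, keyed by what was booted.
ATTEMPTS = {
    "dkb0 (hard disk)": "halt code = 5\nHALT instruction executed\nPC = 20000030",
    "Digital UNIX 4.0D": "halt code = 1\noperator initiated halt\nPC = fffffc00004351a4",
    "Tru64 UNIX 5.0": "halt code = 1\noperator initiated halt\nPC = ffffffff0024cfa4",
    "Tru64 UNIX 5.1B": "halt code = 1\noperator initiated halt\nPC = ffffffff800e6be0",
    "OpenVMS 8.4": "halt code = 1\noperator initiated halt\nPC = ffffffff83188df4",
}

HALT_RE = re.compile(r"halt code = (\d+)\n(.+)")


def halt_reason(stanza):
    """Return (code, description) from a console halt stanza."""
    match = HALT_RE.search(stanza)
    if match is None:
        raise ValueError("no halt code found")
    return int(match.group(1)), match.group(2).strip()


tally = Counter(halt_reason(stanza) for stanza in ATTEMPTS.values())
for (code, text), count in tally.most_common():
    print(f"halt code {code} ({text}): {count} attempt(s)")
```

Every CD boot ends with the same halt code 1, at a different PC each time; only the hard disk attempt, which couldn't find osf_boot, halts with code 5.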

Ethernet and Boot Server

This forum post has information on updating firmware via the network. It seems that I need a *NIX server with mopd running. The firmware file then just needs to be named after the MAC address of the machine in question, with the '.SYS' extension. I used a VM that I still had available running A2SERVER. mopd was installed via apt-get and all was well.

Running this inside a VM on a Windows machine has proved problematic before. When I was trying to network boot the 386, the installer from the floppy couldn't see the NFS server in A2SERVER on my other Windows PC. I therefore built a real physical Linux box hard-wired into the network. I've chosen to do this again with my Let's Note CF-Y7 Toughbook that's been gathering dust for quite some time.

I installed Linux Mint 18.1 only to find out that 'mopd' wasn't in the package repository. Googling for it, I found that it was last included back at Trusty Tahr. Fortunately, we can work around this. Download the mopd package directly from here (or here if you're running i386) and then also get libelfg0, as this isn't in the newest repo either.

From here, run sudo dpkg -i libelfg0_0.8.13-5_amd64.deb and then sudo dpkg -i mopd_2.5.3-21_amd64.deb. This'll get mopd installed. Run man mopd to see where it expects its server directory to be; in this case /srv/tftp/mop. Create this directory, grab the firmware file from here and copy it in.

Based on the forum post instructions above, we need the file to be named mac_address.SYS. We can get the address from the initialisation log: 08-00-2B-3D-DF-18. I therefore ran cp cfw_v40_updp3.sys 08002b3ddf18.SYS. Yes, use lowercase for the address and uppercase for the extension.
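
If you'd rather not retype the address by hand, the renaming rule is simple enough to script. Here is a minimal sketch; mop_image_name is a made-up helper, not part of mopd. It accepts both the console's 08-00-2B-3D-DF-18 style and the unpadded 8:0:2b:3d:df:18 style that mopd prints.

```python
import re


def mop_image_name(mac, extension="SYS"):
    """Build the firmware file name: lowercase MAC digits plus uppercase extension."""
    octets = re.split(r"[-:]", mac.strip())
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets in {mac!r}")
    values = [int(octet, 16) for octet in octets]  # rejects non-hex text
    if any(value > 0xFF for value in values):
        raise ValueError(f"octet out of range in {mac!r}")
    # Zero-pad each octet so "8:0:2b" becomes "08002b".
    digits = "".join(f"{value:02x}" for value in values)
    return f"{digits}.{extension.upper()}"


print(mop_image_name("08-00-2B-3D-DF-18"))  # 08002b3ddf18.SYS
```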

steven@letsnote-y7 /srv/tftp/mop $ wget https://modelrail.otenko.com/assets/dec4000axp/cfw_v40_updp3.sys
--2017-05-19 18:37:37--  https://modelrail.otenko.com/assets/dec4000axp/cfw_v40_updp3.sys
Resolving modelrail.otenko.com (modelrail.otenko.com)... 119.15.98.75
Connecting to modelrail.otenko.com (modelrail.otenko.com)|119.15.98.75|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1915904 (1.8M) [application/octet-stream]
Saving to: ‘cfw_v40_updp3.sys’

cfw_v40_updp3.sys   100%[===================>]   1.83M  1.21MB/s    in 1.5s    

2017-05-19 18:37:39 (1.21 MB/s) - ‘cfw_v40_updp3.sys’ saved [1915904/1915904]

steven@letsnote-y7 /srv/tftp/mop $ cp cfw_v40_updp3.sys 08002b3ddf18.SYS
steven@letsnote-y7 /srv/tftp/mop $ sudo mopd -a -d
mopd: not running as daemon, -d given.
MOP DL 802.3 8:0:2b:3d:df:18   > ab:0:0:1:0:0      len   71 code 08 RPR 
My address is 00:13:e8:2d:c0:b1
MOP DL 802.3 0:13:e8:2d:c0:b1  > 8:0:2b:3d:df:18   len    9 code 03 ASV 
MOP DL 802.3 8:0:2b:3d:df:18   > ab:0:0:1:0:0      len   71 code 08 RPR 
My address is 00:13:e8:2d:c0:b1
MOP DL 802.3 0:13:e8:2d:c0:b1  > 8:0:2b:3d:df:18   len    9 code 03 ASV 
MOP DL 802.3 8:0:2b:3d:df:18   > 0:13:e8:2d:c0:b1  len   71 code 08 RPR 
MOP DL 802.3 0:13:e8:2d:c0:b1  > 8:0:2b:3d:df:18   len 1492 code 02 MLD 
MOP DL 802.3 8:0:2b:3d:df:18   > 0:13:e8:2d:c0:b1  len   71 code 08 RPR 
MOP DL 802.3 0:13:e8:2d:c0:b1  > 8:0:2b:3d:df:18   len 1492 code 02 MLD 
MOP DL 802.3 8:0:2b:3d:df:18   > 0:13:e8:2d:c0:b1  len   46 code 0a RML 
MOP DL 802.3 0:13:e8:2d:c0:b1  > 8:0:2b:3d:df:18   len 1492 code 02 MLD 
MOP DL 802.3 8:0:2b:3d:df:18   > 0:13:e8:2d:c0:b1  len   46 code 0a RML 
MOP DL 802.3 8:0:2b:3d:df:18   > 0:13:e8:2d:c0:b1  len   46 code 0a RML 
MOP DL 802.3 0:13:e8:2d:c0:b1  > 8:0:2b:3d:df:18   len 1492 code 02 MLD
...

And that was the unit actually accepting the firmware and running the update utility!
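
The mopd trace is regular enough to parse if you want to follow the conversation. The sketch below reads a few lines copied from the trace above. The long names for the message codes are an assumption based on the MOP protocol's message names; mopd itself only prints the three-letter abbreviations.

```python
import re

# Assumed expansions of the codes seen in the trace (not printed by mopd).
MOP_CODES = {
    0x02: "Memory Load",
    0x03: "Assistance Volunteer",
    0x08: "Request Program",
    0x0A: "Request Memory Load",
}

LINE_RE = re.compile(
    r"MOP DL 802\.3\s+(\S+)\s+>\s+(\S+)\s+len\s+(\d+)\s+code\s+([0-9a-f]{2})\s+(\w+)"
)

SAMPLE = """\
MOP DL 802.3 8:0:2b:3d:df:18   > ab:0:0:1:0:0      len   71 code 08 RPR
My address is 00:13:e8:2d:c0:b1
MOP DL 802.3 0:13:e8:2d:c0:b1  > 8:0:2b:3d:df:18   len    9 code 03 ASV
MOP DL 802.3 8:0:2b:3d:df:18   > 0:13:e8:2d:c0:b1  len   71 code 08 RPR
MOP DL 802.3 0:13:e8:2d:c0:b1  > 8:0:2b:3d:df:18   len 1492 code 02 MLD
MOP DL 802.3 8:0:2b:3d:df:18   > 0:13:e8:2d:c0:b1  len   46 code 0a RML
"""


def parse_trace(text):
    """Return (source, destination, length, code, abbreviation) per MOP line."""
    return [
        (src, dst, int(length), int(code, 16), abbrev)
        for src, dst, length, code, abbrev in LINE_RE.findall(text)
    ]


for src, dst, length, code, abbrev in parse_trace(SAMPLE):
    name = MOP_CODES.get(code, "unknown")
    print(f"{src} -> {dst}: {abbrev} ({name}), {length} bytes")
```

Read this way, the DEC 4000 first sends its program request to the ab:0:0:1:0:0 multicast address, the laptop volunteers, and from then on the two talk directly to each other.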

>>>b eza0
(boot eza0.0.0.6.0 -flags 0)

Trying MOP boot.
.................................

Network load complete.
Host name: ipc
Host address: 00-13-e8-2d-c0-b1

bootstrap code read in
base = 1f4000, image_start = 0, image_bytes = 1d3a00
initializing HWRPB at 2000
initializing page table at 1e6000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code


VMS PALcode V5.56A, OSF PALcode V1.45A

Lbus & Fbus have been reset and Lbus enabled
initializing timer data structures
lowering IPL
counted 15626232 cycles in 100 ticks
CPU 0 speed is 6.25 ns (160 MHz)
entering idle loop
Starting Memory Diagnostics
Leaving back-to-back transactions turned off
Testing CMIC on Memory Module 0
Turning on the stream buffers
Testing all memory banks in parallel
Testing Memory bank 0
Testing Memory bank 1
Testing Memory bank 2
Testing Memory bank 3
Memory size = 8000000
* Warning * Console image larger than Bcache
Memory Configuration skipped
Memory Diagnostics completed
access NVRAM
test Script RAM
enable ncr4 ACK
test Storage Bus E
Initializing driver eza0.0.0.6.0.
Driver eza0.0.0.6.0 initialized.
Initializing driver ezb0.0.0.7.0.
enable ncr0 ACK
test Storage Bus A
Driver ezb0.0.0.7.0 initialized.
enable Fbus
Start of FBUS sizer
Fbus sizer completed
enable ncr1 ACK
test Storage Bus B
enable ncr2 ACK
test Storage Bus C


                ***** Loadable Firmware Update Utility *****
------------------------------------------------------------------------------
 Function    Description
------------------------------------------------------------------------------

 Display     Displays the system's configuration table.
 Exit        Done exit LFU (reset).
 List        Lists the device, revision, firmware name and if found by LFU.
 Update      Replaces current firmware with loadable data image.
 Verify      Compares loadable and hardware images.
 ? or Help   Scrolls this function table.
------------------------------------------------------------------------------

 Type Help  for additional information


UPD> help update
 Update a particular device with LFU's firmware.
   The command format is: UPDATE  [-PATH ]
   For example:
           update *

   Will update all LFU supported devices found in this system

           update io

   Will update the device named IO
   Use the LIST command to see the supported LFU devices

   You can optionally update a device with different firmware than
   defaulted to by LFU, by using the -PATH switch.
   For example:
           update io -path mopdl:new_firm/eza0

   Will update the device named IO with firmware NEW_FIRM from the
   network.


UPD> update *

Confirm update on:
io
[Y/(N)]y
WARNING: updates may take several minutes to complete for each device.

                          DO NOT ABORT!

io              Updating to 4.0...  Verifying 4.0...  PASSED.

UPD> list

device            FW Rev           Filename               Found

io                 4.0             cfw_e43                  Y

UPD> display

                                 Rev                    Events logged
 Slot   Option  Part#           Hw Sw   Serial#         SDD     TDD
   1    IO      B2101-AA        J2 34   AY34507889      00      00
   2
   3    CPU0    B2001-BA        B2 34   AY33479304      00      01
   4    MEM0    B2002-DA        C1 0    GA33306138      00      00
   5
   6
   7

Futurebus+ Nodes
                           Rev
 Slot   Option  Part#    Hw   Fw   Serial#              Description
   1
   2
   3
   4
   5
   6

UPD>

Rebooting showed no changes and CDs still failed when trying to print out their OS names...
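
For what it's worth, the loads themselves look sane. The block counts and image_bytes values in the transcripts agree with each other if the console is counting 512-byte blocks, which is an assumption on my part. A quick check:

```python
BLOCK_BYTES = 512  # assumption: the console counts 512-byte blocks

# (blocks read, image_bytes) as printed in the boot transcripts above.
LOADS = {
    "Digital UNIX 4.0D CD": (16, 0x2000),
    "Tru64 UNIX 5.0 CD": (16, 0x2000),
    "Tru64 UNIX 5.1B CD": (15, 0x1E00),
    "OpenVMS 8.4 CD": (1230, 0x99C00),
}

for name, (blocks, image_bytes) in LOADS.items():
    matches = blocks * BLOCK_BYTES == image_bytes
    print(f"{name}: {blocks} blocks, {image_bytes} bytes, consistent={matches}")

# The network boot: size of the downloaded file versus what the console loaded.
FILE_BYTES = 1915904      # from the wget output
NETWORK_IMAGE = 0x1D3A00  # image_bytes from the MOP boot
print("file minus loaded image:", FILE_BYTES - NETWORK_IMAGE, "bytes")
```

The firmware file is exactly 512 bytes larger than the image the console reports loading. My guess is that this is a header block that gets stripped along the way, but I haven't verified that.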

Operator Initiated Halt (Update: 19/01/2021)

Ahhh crap. I've just done a 'proper' google for the error message I was getting above: operator initiated halt. It turns out that one of the first hits is a really cool forum post on community.hpe.com that describes the reason for that error! It seems my actual front-panel HALT button may have been faulty! This hindsight makes me sad... it would've been fun to get the unit up and running and serve a text file to my network... whilst draining the apartment block's power. Such is life.
