back
Raycasted tunnels

 Nao's 512 byte demo explained

by Nao, Html by Azure.

This is the source of Naos 512 byte demo explained. In case you need to know how to do a raycasted free directional 3d tunnel this is definitly the place to look for. Also several other tricks like texture generation etc are explained.

This intro performs that boring mapped tunnel that all of us have seen and seen hundreds of times. Now, where is the fun? Try to put all the 'normal' tunnel code in 512 bytes and you will experience several headache.

For those who dont know how this kind of effects works let me say that all the theory is very simple. In few words tunnel is calculated by intersecting rays with tunnel surface (that is a infinte cylinder equation). It's a kind of raytracing. But we'll not calc EVERY pixel on the screen in this way, 'cause it would be too slow, very slow, at least on 68k series. We'll perform calculation only on a 32*32 screen (1024 rays!) and then we'll interpolate this texture value on a 8x8 pixels wide grid. In this manner you can view on your screen a full 256x256 pixels tunnel with no-noticeable loss.

What do we need to realize a such tunnel effect?

1. one palette :-)
2. sin/cos table
3. one texture
4. a simple raytracer
5. a grid expander (to zoom 32x32 screen on a 256x256 one)
6. some code that moves and rotates camera along this damn tunnel :)

```Init:	Lea	Base,a0			;ALL the bss stuff is referred to this pointer
```

First of all we'll generate some texture. I opted for a simple black to blue texture..so dont blame me for this foolish colors :)

```	move.l	a0,a1			;save bss pointer
moveq	#-1,d7			;just 256 color :-)
moveq	#0,d0			;initializes to black (we work on RGB values)
.pal	move.l	d0,(a1)+		;store color
subq.b	#1,d7
bcc.s	.pal			;loop just 256 times
```

Now we have to generate a sin/cos table.;We could just make only a sin table and then extract ;cos values by adding pi/4 period offset on sin pointer.. This way should be also more shorter than the double sin/cos stuff but I cant produce a good routine that calcs only a sin table, I'm a lamer,you know :-) So I used an idea already explained on http://come.to/amiga in one Azure's doc.

In a brief summary this sin/cos work simply applicating infinitesimal rotation to a vector. The vector starts at (1,0) coords..so x will be cos value and y will be the sin one. Applicating an infinetisimal rotation isnt so simple because you can experiment some vey bad accuracy. To fix this problem i've just done some improvements on the classical infinitesimal rotation matrix.

```steps	equ	2048			;sin/cos table steps
factor	equ	steps*10000/2/31415	;fixed point factor

move.l	a1,a2			;saves sin table pointer
move.l	#factor,d5		;this piece of code
move.w	#steps,d0		;generates a sin/cos table
lsr.w	#3,d7			;using an infinitesimal
mulu.w	d5,d0			;rotation matrix
moveq	#0,d2			;u can find all the math stuff..
;on http://come.to/amiga
.scloop	move.l	d2,d3			;look at azure's doc with my final
divs.l	d5,d3			;error correction chapter
move.w	d3,(a1)+		;btw, this isn't the shortest
sub.l	d3,d0			;way i know to generate a sin table
move.l	d0,d1			;but afaik probably
divs.l	d5,d1			;it's the shortest way
move.w	d1,(a1)+		;to make a sin/cos table
dbra	d7,.scloop

```

Now, how to generate a good texture to map on tunnel? There are several ways to accomplish this work..like fractal and procedural generator..but we'll use a 'dumb generator' :-)

First we fill 256x256 texture with random values. Then we'll 'blur' this texture by calculating the average value of every texel, using the top,bottom,left and right texels that lies near our average one. This blur pass will be applicated several times..and in this way we will have the initial random generated mess smoothed..but now we havent something that could be called as a 'good texture'. So..we have to BREAK the 'monotony' of texture by applicating a non linear filter. This non-linear filter just take one texel, multiply this for 4, and the if value it's greater than 127 we'll subtract 127. In this way we'll break the smoothness and texture will be more interesting. Now let's apply all this passes just a few times and will obtain a so called 'decent texture' :-) Another problem to solve it's how can we make out texture tileable. This is not a big problem, all that we have to do it's think our texture space like a finite but unlimitated (ie. no boundaries) space.

```
;* This is V.R.T.G ************************************
;* Very Random Textures Generator :-D *****************

move.l	a1,a0			;saves pointer
moveq	#4,d0
.random	ror.l	d0,d0			;tanx to Azure for this..
addq.l	#7,d0			;smart (and chip) rnd generator
move.b	d0,(a0)+		;fill texture with random values
dbra	d7,.random
moveq	#6-1,d4			;filter passes
.start	moveq	#6-1,d6			;blur passes
move.l	a1,a0			;saves texture pointer
.loopf	move.b	(a0),d0			;load one texel
lsl.b	#2,d0			;*4
bge.s	.ok			;it's greater than 127?
not.b	d0			;yeah..so make it 'reasonable' :)
.ok	move.b	d0,(a0)+		;store texel
dbra	d7,.loopf
.loopbb	move.l	a1,a0			;some pointers sutff
move.l	a1,a5
move.l	#256,d0
move.l	d0,d1
neg.w	d1
.loopb	moveq	#0,d2			;loads 4 texels
move.b	-1(a0),d2		;left
moveq	#0,d3
lsr.w	#2,d3			;perform average
move.b	d3,(a0)+		;and store smoothed texel
addq.w	#1,d0			;increase bottom and top pointers
dbra	d7,.loopb
dbra	d6,.loopbb
dbra	d4,.start

move.l	a1,a5			*saves texture pointer
move.l	a0,a1			*saves grid pointer
lea	(33*33*8)(a1),a4	*saves temp pointer
move.l	a4,a0
lea	(32*4)(a0),a0		*chunky pointer
rts				;this rts can removed
;but who care? :)

;This is dumb camera code..no words on it :-)

VBlank:	; Called every vblank after Init has finished.

move.w	param4(a4),d0		;just...
movem.w	(a2,d0.w*4),d0/d1	;some...
asr.w	#1,d0			;lissajeous..
asr.w	#2,d1			;trick..
move.w	d0,param1(a4)		;to move..
move.w	d1,param2(a4)		;and rotate camera
move.w	(a2,d1.w*4,2048*2*4+384),d0
lsl.w	#3,d0
move.w	d0,param3(a4)
rts

Main:	; Called once when Init has finished.
; Registers are as left by Init.
; If it terminates, the demo will exit.

* A5 --> Texture Pointer
* A4 --> TempData Pointer
* A2 --> Sin/Cos Pointer
* A1 --> Grid Pointer
* A0 --> Chunky Pointer

param1		equ	0
param2		equ	2
param3		equ	4
param4		equ	6

;This it's the main part.
;Simply we call first tracer and then grid expander.
;Camera movements are managed in verical blanck routine.

Main2	bsr.b	Tracer			;call ray tracer
bsr.w	Grid			;call grid expander
bsr.w	Update256x256
bra.b	Main2

rts

```

Here the most interesting part of this intro. Imagine that observer (and you) is located on our universe axis origin (0,0,0). Imagine also that our screen is located in front of us. Screen plane is parallel to plane composed by X and Y axis. Now, our screen is 32x32 wide and so we have to trace 1024 rays, from observer to EVERY pixel on this projecting screen. For every ray we have to calc his intersection with tunnel and than find out what (u,v) texture coords assign at this intersection point. Mind that intersection points is in a 3 dimension space, while texture points lies on a 2 dimension space, so there isnt a general rule (ie. a |R�-->|R� function) to map the texture.

Calculating the ray intersection with the tunnel is very simple. Our ray always starts from (0,0,0) so the general parametric equations to describe that ray are:

x = a*t
y = b*t
z = c*t

where a,b,c are ray coefficients and t it's the only real parameter.

The tunnel is described by the classical infinite tunnel equation. Let z axis be the tunnel axis, so the equation is:

x�+y� = R�

where R it's cylinder radius. NOW, just substitutes ray equations into cilinder one and u'll obtain:

t = +/- (R/sqrt(a�+b�))

We'll discard negative solutions 'cause those intersection points are behind the observer. Now, with t calculated, just substitute t in ray equation to find out intersection point. Very simple and linear.

But how to rotate/move camera and how to choice a good function to map texture on tunnel? We'll see that move tunnel problem it's inherent to texture mapping function problem. So we'll first accomplish to rotate camera stuff. In this intro camera can roll on x and y axis by simply rotate intersection rays! This is a dumb way to perform camera rotation, in fact there are faster ways to do the same thing..but remind, we need short code :)

Now we have to solve the mapping problem. There isnt a canonic way to solve this problem, you can come out with several different solutions, all good, all nice :-) But it's ok that we have to display a nice-to-see tunnel, so we are looking for the best-look function. So, the more used function it's like

f(x,y,z) -> (u,v)

u = atan (y/x)
v = z

but atan function cant be performed in few bytes and we have so little space :-( Ergo, i tried several different solution and I found a good one that has also very very short code :-) just :

u = x^2 (or u = y^2)
v = z

This solution is good when x is a 'big' number..where x is less than one it's became a very lame mapping :-) So we have just solved camera along z traslations, in fact we can just add a traslation value to v mapping coordinate and we can magically traslate our camera..nice? :-) ;let's start with some code:

```
Tracer	movem.l	d0-a6,-(sp)
moveq	#32/2,d7	;y loop counter
moveq	#-32/2,d1	;we start at the upper left screen corner
.y	moveq	#-32/2,d0	;to trace our rays
.x	moveq	#26,d2		;screen distance by observer (ie. focal lenght!)
;bigger value make it zoomed,
;lower make it seems look trought to a
;fish eye lens :-)

move.w	param1(a4),d3	;rotates ray on X axis
and.w	#\$7ff,d3	;make angle fit into table range
movem.w	(a2,d3.w*4),d3/d4 ; get sin(a) and cos(a)
move.l	d3,d5
move.l	d4,d6
muls.w	d2,d3		;z*sin
muls.w	d1,d4		;y*cos
muls.w	d1,d5		;y*sin
muls.w	d2,d6		;z*cos
sub.l	d3,d4		;Y

move.l	d4,-(sp)	;saves Y value

move.w	param2(a4),d3	;rotate ray on Y axis
and.w	#\$7ff,d3
movem.w	(a2,d3.w*4),d3/d5 ;get sin(b) and cos(b)
move.l	d3,d4
move.l	d5,d2
muls.w	d0,d3		;x*sin
muls.l	d6,d4		;z*sin
muls.w	d0,d2		;x*cos
muls.l	d6,d5		;z*cos
moveq	#11,d6
asr.l	d6,d4
asr.l	d6,d5
sub.l	d3,d5		;Z

move.l	(sp)+,d6	;gets Y value and..
move.l	d5,-(sp)	;saves the Z ones

move.l	d6,d5		;now
move.l	d4,d6		;we
asr.l	#6,d5		;just
asr.l	#6,d4		;calc
muls.w	d5,d5		;a^2
muls.w	d4,d4		;b^2
addq.l	#1,d5		;to avoid division by zero :-)

```

Now, how do sqrt? I used this routine coded by an unknown archimedes coder and then adapted on 6502 and 680x0 by Graham. How it works? find it by yourself :-) ;btw..just think that an integer number can be written as a product of several factor all equal to 4 and one other real factor.. :-)

```   moveq   #1,d3
ror.l   #2,d3
moveq   #32,d2
.l2n
move.l  d3,d4
rol.l   d2,d4
cmp.l   d4,d5
bcs.b   .no
sub.l   d4,d5
.no
subq.w  #2,d2
bgt.b   .l2n

;Now we have just to perform last calculation..

.ok	move.l	(sp)+,d4	;i know that this isn't the academic
asl.l	#5,d4		;way to make tunnels, but that's the
asl.l	#5,d6		;only way i found that eliminates
divs.w	d3,d6		;arctan calculation and short both
divs.w	d3,d4		;(my atan routine is 100 bytes long....)
add.w	param3(a4),d4	;Z axis camera traslation
move.w	d4,(a1)+	;just write (u,v)
move.w	d6,(a1)+

cmp.w	d7,d0
ble.w	.x
cmp.w	d7,d1
ble.w	.y
movem.l	(sp)+,d0-a6
rts

```

Here comes the routine that puts all the stuff on screen. it's very simple..just think to dispose all (u,v) calculated with tracer on a grid made by 8x8 pixel square. At every knot you will find a differente (u,v) value. Now we have to interpolate this values from knot to knot for every little square. I dont bother you with interpolation formulas that are just dumb

```*A5-> texture pointer*
*A1-> 33x33 (u,v) grid pointer*
*A0-> chunky buffer*

;I know that actually this routine it's slow
;but this is a short code compo, that isn't? :)

Lattice	movem.l	d0-a6,-(sp)
moveq	#32-1,d7		;y loop counter
moveq	#0,d0
.scanli	swap	d7
move.w	#32-1,d7		;x loop counter

.square	move.l	a0,a6
move.l	(33*4)(a1),a4		;(u4,v4)
move.l	(a1)+,d1		;(u1,v1)
move.l	(33*4)(a1),a3		;(u3,v3)
move.l	(a1),d2			;(u2,v2)
sub.l	d1,a4			;(u4-u1,v4-v1)
sub.l	d2,a3			;(u3-u2,v3-v2)
lsl.l	#3,d1			;instead divide by 8 all the increments..
lsl.l	#3,d2			;we just multiply by 8 all the offset :)
;in this way we have not accuracy loss.
moveq	#8-1,d6

.Yspan	move.l	d1,d3			;(uL,vL)
move.l	d2,d4			;(uR,vR)
swap	d6
sub.l	d3,d4			;(uR-uL,vR-vL)
asr.l	#3,d4
lsl.w	#3,d4
asr.w	#3,d4

.Xspan	move.w	d3,d0			;need explanations
rol.l	#8,d3			;this loop? :)
move.b	d3,d0
ror.l	#8,d3
move.b	(a5,d0.l),(a6)+		;just do it :-)
subq.w	#1,d6
bne.s	.Xspan

lea	256-8(a6),a6		;next span
swap	d6
dbra	d6,.Yspan

addq.l	#8,a0			;next nice little square
dbra	d7,.square
lea	(256*7)(a0),a0
swap	d7
dbra	d7,.scanli
movem.l	(sp)+,d0-a6
rts

;And that's all Folks! :-)

;; ******************** BSS area ********************

section	BSS_Area,bss

; declare BSS vars here. and only here

Base
Palette		ds.l	256
Sin		ds.l	2048*4
Texture		ds.b	256*256
Grid		ds.l	33*33*2
TempData	ds.l	32
Chunky		ds.b	256*256
```

All code by nAo/darkAge (when I wrote this i was ramjam :-) ) except where is differently specificated.