
[PATCH] USB: add zr364xx V4L2 driver



On Wed, 14 Feb 2007, Alan wrote:
> > > My comment is not very good, in fact on some cameras I need to swap the bytes
> > > to have correct JPEG data (so this is not an endianness issue I think).
> > > Maybe there is a macro to swap bytes in a buffer? I cannot find it.
> >
> > Sorry, there's a swab32, but no swab16. I misremembered.
>
> Its just called "swab" for 16bit values and is a gcc builtin/string
> function.

The C library function swab() isn't usable in the kernel, as it's not part
of the kernel's C lib.

Gcc doesn't have a builtin swab/bswap16 yet, maybe it will someday:
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg00496.html

The kernel does have swab64, swab32, and yes, swab16 macros!  They're all
defined in the same place in asm/byteorder.h.  There are architecture
optimized versions for some cases, but not for x86 and swab16 as gcc
supposedly does ok (or does it? *).

There are three versions of the swabXX functions: a normal one, one that
takes a pointer to the data, and one that swaps the data in-place.  The
more specialized versions might be faster in some cases.  I don't see any
version that swaps an array of data, like C-lib swab(), which would be a
lot more useful than swab16 vs swab16p vs swab16s, IMHO.

uint16_t *p;
*p = swab16(*p);  // one way
*p = swab16p(p);  // maybe better
swab16s(p);       // best
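
For what it's worth, an array version is only a few lines.  Here is a
minimal userspace sketch of what I mean.  Note that my_swab16s() is just a
local stand-in for the kernel's swab16s(), and swab16_array() is a
hypothetical name, not something that exists in the kernel headers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's swab16s(): swap the two bytes
 * of the 16-bit value that p points to, in place. */
static inline void my_swab16s(uint16_t *p)
{
    *p = (uint16_t)((*p << 8) | (*p >> 8));
}

/* Hypothetical helper (not a kernel API): byte-swap n 16-bit words
 * in place, the way C-lib swab() handles a whole buffer. */
static void swab16_array(uint16_t *buf, size_t n)
{
    size_t i;

    assert(buf != NULL || n == 0);
    for (i = 0; i < n; i++)
        my_swab16s(&buf[i]);
}
```

In a driver the call would be something like swab16_array(buf, len / 2),
assuming buf is suitably aligned for 16-bit access and len is even.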

>> +            /* swap to good indian if camera needs it */
>> +            if (cam->method == 0)
>> +                    for (i = 0; i < BUFFER_SIZE; i += 2) {
>> +                            swap = cam->buffer[i];
>> +                            cam->buffer[i] = cam->buffer[i + 1];
>> +                            cam->buffer[i + 1] = swap;
>> +                    }

+            /* swap to good endian if camera needs it */
+            if (cam->method == 0)
+                    for (i = 0; i < BUFFER_SIZE/2; i++) {
+                            swab16s((uint16_t*)cam->buffer +i);
+                    }

or

+            /* swap to good native american if camera needs it */
+            if (cam->method == 0) {
+                    uint16_t *buf = (uint16_t *)cam->buffer;
+                    for (i = 0; i < BUFFER_SIZE/2; i++)
+                            swab16s(buf++);
+            }
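
To check that the 16-bit version really does the same thing as the
original byte-by-byte loop, here is a small userspace sketch.  The function
names are made up for illustration, and the word-wise variant goes through
memcpy() so that it makes no assumption about the alignment of the buffer
(the casts in the patches above assume cam->buffer is 16-bit aligned):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same idea as the loop in the original patch: exchange each pair of
 * adjacent bytes.  A trailing odd byte is left untouched. */
static void swap_pairs_bytewise(unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        unsigned char tmp = buf[i];

        buf[i] = buf[i + 1];
        buf[i + 1] = tmp;
    }
}

/* Same result, done as a 16-bit swap per pair.  memcpy() keeps this
 * safe for unaligned buffers; the result does not depend on the
 * endianness of the host. */
static void swap_pairs_wordwise(unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        uint16_t w;

        memcpy(&w, buf + i, sizeof(w));
        w = (uint16_t)((w << 8) | (w >> 8));
        memcpy(buf + i, &w, sizeof(w));
    }
}
```

Running both over copies of the same buffer and comparing with memcmp()
should show no difference.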


*** Does gcc really optimize swab16() well?

Compiled this with gcc 4.0.1 for athlon (using 2.6.20's compiler options):
void bar(uint16_t *p)
{
    int i;
    for(i=0;i<127;i++)
        swab16s(p + i);
}

Resulting asm code does not look that good to me.  gcc does a copy, two
shifts, and then an or to effect the swab16.  Surely rotating a 16-bit
register would be faster?  There shouldn't be any partial register stalls.
I don't see why gcc decides to add two to the pointer, then offset it by -2
when it uses it.  What's the point of that?

bar:
        pushl   %ebx    #
        movl    $1, %ebx        # %ebx = i
        leal    2(%eax), %ecx   # %ecx = p+2, why add 2?  just use eax
        .p2align 4,,7
.L21:
        movzwl  -2(%ecx), %eax  # Have to offset by -2
        incl    %ebx            #
        movl    %eax, %edx      # do the swab16
        sall    $8, %edx        #
        shrl    $8, %eax        #
        orl     %eax, %edx      #
        movw    %dx, -2(%ecx)   #
        addl    $2, %ecx        # why not skip this and use (%ecx,%ebx,2)
        cmpl    $128, %ebx      # counting from -128...0 would avoid this
        jne     .L21
        popl    %ebx            # used too many registers
        ret

Anyway, surely this would be faster:
bar:
	movl	$-128, %ecx	# start at -128, count to 0
	add	$256, %eax	# %eax = p + 256 bytes, so (%eax,%ecx,2) is p[0] at %ecx = -128
	.p2align 4,,7
.L21:
	movzwl	(%eax,%ecx,2), %edx
	rorw	$8, %dx		# all that's needed for swab16
	movw	%dx, (%eax,%ecx,2)
	inc	%ecx
	jnz	.L21
	ret

Ok, the loop optimization is a little hard for gcc, but isn't it supposed
to be able to figure out "rorw $8, %reg"?
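
For reference, here is roughly what the hand-written loop looks like when
written back in C.  This is only a sketch: ror16() and bar_neg_index() are
names I made up, and whether a given gcc version turns either form into a
single rorw is exactly the open question, so check the generated asm
rather than trusting it.

```c
#include <stddef.h>
#include <stdint.h>

/* Shift/or form, matching what gcc emitted in the listing above. */
static inline uint16_t swab16_shift(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Generic 16-bit rotate right; with n == 8 this is the C spelling of
 * "rorw $8, %dx".  The masking avoids shifting by 16. */
static inline uint16_t ror16(uint16_t x, unsigned int n)
{
    n &= 15;
    return (uint16_t)((x >> n) | (x << ((16 - n) & 15)));
}

/* Loop counting from -n up to 0, indexing off the end of the array,
 * so that the loop test is just "is the index zero yet". */
static void bar_neg_index(uint16_t *p, ptrdiff_t n)
{
    uint16_t *end = p + n;
    ptrdiff_t i;

    for (i = -n; i != 0; i++)
        end[i] = ror16(end[i], 8);
}
```

For 16-bit values a rotate by 8 and a byte swap are the same operation,
which is easy to confirm by comparing ror16(x, 8) against swab16_shift(x)
for all 65536 inputs.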

--
video4linux-list mailing list