"AIscm" array JIT

From: Jan Wedekind
Subject: "AIscm" array JIT
Date: Wed, 8 Jun 2016 22:16:04 +0100 (BST)
User-agent: Alpine 2.11 (DEB 23 2013-08-11)

I am working on a compact library [1] for JIT compilation of array operations. It currently runs only on AMD64 (x86-64) processors and supports array operations on booleans, integers, integer RGB values, and integer complex numbers. Important things are still missing: floating-point numbers, compiling calls to C functions (e.g. sin, cos, ...), tensor operations, convolutions, ... Eventually I would like to do numerical processing similar to Python's NumPy (but more generic), Theano (but with a more compact syntax, facilitated by macros), and OpenCV.

Here is an example that adds an integer to each element of a 2D array and returns the result:

    scheme@(guile-user)> (use-modules (oop goops) (aiscm jit) (aiscm int)
                                      (aiscm pointer) (aiscm sequence))
    scheme@(guile-user)> (+ (arr (2 3 5) (7 11 13)) 3)
    $1 = #<sequence<sequence<int<8,unsigned>>>>:
    ((5 6 8)
     (10 14 16))
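The call above goes through a GOOPS generic that specialises itself on first use, as the next paragraph describes. A minimal sketch of that pattern in plain GOOPS follows. It is not AIscm's actual code: `plus`, `compile-plus`, and `compile-count` are hypothetical names, and the "compiler" just returns an ordinary closure over lists instead of emitting machine code.

```scheme
(use-modules (oop goops))

;; Hypothetical stand-in for the JIT compiler: it returns a plain closure
;; and counts how often a "compilation" is requested.
(define compile-count 0)
(define (compile-plus types)
  (set! compile-count (+ compile-count 1))
  (lambda (a b) (map + a b)))

(define-generic plus)

;; Fallback method: only reached when no method exists yet for the
;; classes of the actual arguments.
(define-method (plus (a <top>) (b <top>))
  (let ((types (list (class-of a) (class-of b))))
    ;; Register a method specialised on exactly these classes ...
    (add-method! plus
                 (make <method>
                   #:specializers types
                   #:procedure (compile-plus types)))
    ;; ... and dispatch again, which now selects the new method.
    (plus a b)))

(plus '(2 3 5) '(1 1 1))     ; first call: fallback "compiles", then re-dispatches
(plus '(7 11 13) '(3 3 3))   ; second call: goes straight to the cached method
compile-count                ; => 1
```

The point of the design is that the cost of compilation is paid once per combination of argument types; afterwards the generic's normal dispatch finds the specialised method without entering the fallback.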

The fallback method of the GOOPS generic "+" adds a JIT-compiled plus operation for the specific argument types to the generic and then calls "+" again, so that the call is dispatched to the newly added method. The machine code generated to produce the unsigned byte array is shown below:

     0:   4c 89 64 24 f0          mov    QWORD PTR [rsp-0x10],r12
     5:   48 89 6c 24 e8          mov    QWORD PTR [rsp-0x18],rbp
     a:   4c 89 7c 24 e0          mov    QWORD PTR [rsp-0x20],r15
     f:   4c 89 74 24 d8          mov    QWORD PTR [rsp-0x28],r14
    14:   4c 89 6c 24 d0          mov    QWORD PTR [rsp-0x30],r13
    19:   48 89 5c 24 c8          mov    QWORD PTR [rsp-0x38],rbx
    1e:   48 89 7c 24 f8          mov    QWORD PTR [rsp-0x8],rdi
    23:   4c 8b 64 24 08          mov    r12,QWORD PTR [rsp+0x8]
    28:   48 8b 7c 24 18          mov    rdi,QWORD PTR [rsp+0x18]
    2d:   48 8b 6c 24 20          mov    rbp,QWORD PTR [rsp+0x20]
    32:   8a 44 24 28             mov    al,BYTE PTR [rsp+0x28]
    36:   48 6b de 01             imul   rbx,rsi,0x1
    3a:   49 8b f0                mov    rsi,r8
    3d:   4d 6b cc 01             imul   r9,r12,0x1
    41:   4c 8b fd                mov    r15,rbp
    44:   49 be 00 00 00 00 00    movabs r14,0x0
    4b:   00 00 00
    4e:   4c 8b 44 24 f8          mov    r8,QWORD PTR [rsp-0x8]
    53:   4d 3b f0                cmp    r14,r8
    56:   74 3e                   je     0x96
    58:   49 ff c6                inc    r14
    5b:   4c 6b d9 01             imul   r11,rcx,0x1
    5f:   4c 8b ee                mov    r13,rsi
    62:   4c 6b d7 01             imul   r10,rdi,0x1
    66:   4d 8b e7                mov    r12,r15
    69:   48 bd 00 00 00 00 00    movabs rbp,0x0
    70:   00 00 00
    73:   48 3b ea                cmp    rbp,rdx
    76:   74 16                   je     0x8e
    78:   48 ff c5                inc    rbp
    7b:   45 8a 04 24             mov    r8b,BYTE PTR [r12]
    7f:   44 02 c0                add    r8b,al
    82:   45 88 45 00             mov    BYTE PTR [r13+0x0],r8b
    86:   4d 03 eb                add    r13,r11
    89:   4d 03 e2                add    r12,r10
    8c:   eb e5                   jmp    0x73
    8e:   48 03 f3                add    rsi,rbx
    91:   4d 03 f9                add    r15,r9
    94:   eb b8                   jmp    0x4e
    96:   4c 8b 64 24 f0          mov    r12,QWORD PTR [rsp-0x10]
    9b:   48 8b 6c 24 e8          mov    rbp,QWORD PTR [rsp-0x18]
    a0:   4c 8b 7c 24 e0          mov    r15,QWORD PTR [rsp-0x20]
    a5:   4c 8b 74 24 d8          mov    r14,QWORD PTR [rsp-0x28]
    aa:   4c 8b 6c 24 d0          mov    r13,QWORD PTR [rsp-0x30]
    af:   48 8b 5c 24 c8          mov    rbx,QWORD PTR [rsp-0x38]
    b4:   c3                      ret
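
Apart from saving and restoring callee-saved registers, the listing is a two-level loop: the inner loop (0x73 to 0x8c) loads a byte, adds the scalar held in `al`, stores the result, and advances both pointers by their strides; the outer loop (0x4e to 0x94) advances the row pointers. The following Scheme procedure is my reading of that loop nest, written over bytevectors. The procedure name, the argument order, and the row-major test layout are assumptions made for illustration and do not reflect how AIscm passes arrays.

```scheme
(use-modules (rnrs bytevectors))

;; Add SCALAR to every element of a strided 2D byte array SRC and store the
;; result in DST. Strides are given in bytes (element size 1).
(define (add-scalar-2d! dst dst-outer-stride dst-inner-stride
                        src src-outer-stride src-inner-stride
                        outer-count inner-count scalar)
  (let outer ((i 0) (dst-row 0) (src-row 0))
    (when (< i outer-count)
      (let inner ((j 0) (d dst-row) (s src-row))
        (when (< j inner-count)
          ;; "add r8b, al" is an 8-bit add, so the sum wraps modulo 256.
          (bytevector-u8-set! dst d
                              (logand (+ (bytevector-u8-ref src s) scalar) #xff))
          (inner (+ j 1) (+ d dst-inner-stride) (+ s src-inner-stride))))
      (outer (+ i 1)
             (+ dst-row dst-outer-stride)
             (+ src-row src-outer-stride)))))

;; The 2x3 example from the REPL session, stored contiguously row by row.
(define src (u8-list->bytevector '(2 3 5 7 11 13)))
(define dst (make-bytevector 6 0))
(add-scalar-2d! dst 3 1 src 3 1 2 3 3)
(bytevector->u8-list dst)   ; => (5 6 8 10 14 16)
```

Because both arrays carry their own strides, the same loop also works on non-contiguous views such as slices or transposed arrays.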

Any comments, suggestions, and feedback are welcome!


