Thread: Carry in/out add and borrow in/out sub

  1. #1
    Senior Member
    Join Date
    Mar 2011
    Location
    Seoul
    Posts
    118

    Carry in/out add and borrow in/out sub

    As far as I know, there is no support for inline assembly in OpenCL. However, I would really like some kind of access to 32-bit (and 64-bit, if possible) add and sub functions with carry/borrow in and out.

    In the meantime, are there any efficient ways to perform such operations without many conditional tests or arithmetic operations?

    Also, since there is no operator overloading in OpenCL, developing emulated basic data types (say, 256-bit integers or floating-point numbers) is entirely possible but exceedingly cumbersome. Since 128-bit integers (long long) and floating-point numbers (quad) are reserved for possible future inclusion, how does Khronos plan to implement them?

  2. #2
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: Carry in/out add and borrow in/out sub

    A bit of googling turned up this presentation, http://dl.fefe.de/bignum.pdf, which shows how different arbitrary-precision arithmetic libraries implement bignums.

    See slide 6 in particular. Translated to OpenCL, it says: for addition, accumulate in a 64-bit integer but process 32 bits at a time. That way the carry ends up in the upper half of the 64-bit accumulator, where you can read it.

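    Code :
    ulong l = 0;                              /* running sum; bits 32 and up hold the carry */
    for (uint i = 0; i < m; ++i) {
        l += (ulong)src1[i] + (ulong)src2[i]; /* 32-bit limbs widened to 64 bits */
        dest[i] = (uint)l;                    /* keep the low 32 bits */
        l >>= 32;                             /* carry (0 or 1) into the next limb */
    }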
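    The borrow case from the thread title seems to follow the same pattern. Below is a minimal sketch in plain C (close to OpenCL C, where uint32_t/uint64_t would be uint/ulong). The name sub_limbs, its parameters, and the little-endian limb order are assumptions made for illustration, not something taken from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dest = src1 - src2 - borrow_in over m 32-bit limbs,
   least significant limb first. Returns the final borrow (0 or 1). */
static uint32_t sub_limbs(uint32_t *dest, const uint32_t *src1,
                          const uint32_t *src2, size_t m,
                          uint32_t borrow_in)
{
    assert(borrow_in <= 1u);
    uint64_t borrow = borrow_in;
    for (size_t i = 0; i < m; ++i) {
        /* 64-bit difference; wraps modulo 2^64 if the true result is negative */
        uint64_t d = (uint64_t)src1[i] - src2[i] - borrow;
        dest[i] = (uint32_t)d;   /* keep the low 32 bits */
        borrow = d >> 63;        /* top bit is set exactly when d went negative */
    }
    return (uint32_t)borrow;
}
```

    A returned borrow of 1 means src1 was smaller than src2, and dest then holds the difference modulo 2^(32*m).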

    Multiplication is oh so interesting. However, there's no problem here that is new to OpenCL: the same issues would appear in any implementation of bignum that doesn't use assembly.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer.

  3. #3
    Senior Member
    Join Date
    Mar 2011
    Location
    Seoul
    Posts
    118

    Re: Carry in/out add and borrow in/out sub

    Nice link.

    I was trying something very similar to slide 6, but since the 128-bit unsigned long long isn't yet implemented in OpenCL, I had nothing wider than my 64-bit limbs to catch the overflow directly. I was trying conditional tests, but I tended to branch too much or to repeat the same calculation using the functions hadd or rhadd. The way you pointed out uses a smaller basic data type but is much more concise.
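    When the limbs already fill the widest integer type, one common branch-free option is to detect wraparound by comparing the sum against an operand. Below is a minimal sketch in plain C, assuming unsigned 64-bit limbs and a carry-in of 0 or 1. The name add_carry64 is hypothetical. In OpenCL C the scalar comparisons also yield 0 or 1, but the vector forms yield -1 per lane, so a vectorized version would need adjusting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns (a + b + carry_in) modulo 2^64 and
   stores the carry-out (0 or 1) in *carry_out. */
static uint64_t add_carry64(uint64_t a, uint64_t b,
                            uint64_t carry_in, uint64_t *carry_out)
{
    assert(carry_in <= 1u);
    uint64_t s  = a + b;         /* wraps modulo 2^64 */
    uint64_t c1 = s < a;         /* 1 exactly when a + b wrapped */
    uint64_t r  = s + carry_in;
    uint64_t c2 = r < s;         /* 1 exactly when adding carry_in wrapped */
    *carry_out = c1 | c2;        /* the two wraps cannot both happen */
    return r;
}
```

    This costs two additions and two comparisons per limb and needs no wider type, at the price of being less compact than the 32-bit-limb loop.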

    Unfortunately, many of those more interesting multiplication algorithms are recursive in nature, but if I'm progressively building 128-, 256-, and then 512-bit types, I can use .hi and .lo and get the recursion more or less inherently from the nesting. Tackling division is an even greater joy.
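    The lowest level of that nesting can be sketched as a full 64x64 to 128-bit multiply built from four 32x32 to 64-bit products. This is plain C for illustration only: the u128 struct and the name mul_64x64 are hypothetical, with hi/lo fields chosen to echo the .hi and .lo halves mentioned above. In OpenCL C the built-in mul_hi could likely supply the upper half directly.

```c
#include <stdint.h>

/* Hypothetical 128-bit container; not an OpenCL built-in type. */
typedef struct { uint64_t hi, lo; } u128;

/* Full 64x64 -> 128-bit product from four 32x32 -> 64-bit products. */
static u128 mul_64x64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes to bits 0..63   */
    uint64_t p1 = a_lo * b_hi;   /* contributes to bits 32..95  */
    uint64_t p2 = a_hi * b_lo;   /* contributes to bits 32..95  */
    uint64_t p3 = a_hi * b_hi;   /* contributes to bits 64..127 */

    /* Middle column: three values below 2^32 each, so no 64-bit overflow. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    u128 r;
    r.lo = (mid << 32) | (uint32_t)p0;
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

    The same split applied to a pair of u128 values would give a 256-bit product, and so on up the chain, which is the schoolbook scheme rather than one of the faster recursive algorithms.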
