Beaglebone PRU code

Started by charleskerr, January 23, 2015, 03:17:19 PM

charleskerr

David,
  Per your request, this is the BeagleBone PRU code I use for DMX/2811/SSD output on the PRU. (SSD is a 1 Mbaud serial format similar to Pixelnet/DMX, without the header, using the standard break and carrying 3072 channels.) It uses the direct I/O interface (the register-mapped R30/R31 output pins), so the pin count is limited. It is one serial output per PRU, which was all I needed, so I didn't worry about multiplexing.
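
As a rough illustration of the serial framing used for DMX and SSD, here is a minimal C++ sketch of the line levels sent for one data byte. It follows the SNDBYTE routine in the PRU code further down (one low start bit, eight data bits with the least significant first, two high stop bits). The helper name frameBits is hypothetical and is not part of the posted code.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: expands one data byte into the 11 line levels
// driven for a single serial frame (1 = line high, 0 = line low).
std::array<int, 11> frameBits(std::uint8_t value)
{
    std::array<int, 11> bits{};
    bits[0] = 0;                         // start bit (line low)
    for (int i = 0; i < 8; ++i)
    {
        bits[1 + i] = (value >> i) & 1;  // data bits, least significant first
    }
    bits[9] = 1;                         // first stop bit (line high)
    bits[10] = 1;                        // second stop bit
    return bits;
}
```

Each level is held for one bit time, so the only difference between DMX and SSD in this scheme is the bit duration.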

The 8K of PRU data memory starts with a header defining the output type, the output pin, the number of channels (or pixels if WS2812 is selected), and a flag indicating that data should be read from the rest of the memory and sent out. If you want to keep the output refreshed at a constant rate, or retransmit after a timeout, that is expected to be done at the ARM level (it is not part of the PRU code). Pixel data is expected to be in RGB format. Since this outputs to my 2812 pixels (which are GRB), the PRU swaps those channels before the data is sent out. The WS2812 timing does not follow the spec sheet; the constants are adjusted to what I found to work well for the pixels I make out of the raw chips.
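
To make the RGB-to-GRB swap concrete, here is a small C++ sketch of the bit order the PRU sends for one pixel. It mirrors the order of the SENDPBIT calls in the PRU code further down, where three bytes are loaded into one register and bits 15..8 go out first, then 7..0, then 23..16. The helper name ws2812BitOrder is hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the 24 bits of one pixel in the order they
// are put on the wire, given the pixel as R, G, B bytes in memory.
std::vector<int> ws2812BitOrder(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    // Three bytes loaded little-endian: R in bits 7..0, G in 15..8, B in 23..16.
    const std::uint32_t word = std::uint32_t(r)
                             | (std::uint32_t(g) << 8)
                             | (std::uint32_t(b) << 16);
    std::vector<int> bits;
    bits.reserve(24);
    for (int bit = 15; bit >= 8; --bit)  { bits.push_back((word >> bit) & 1); } // green, MSB first
    for (int bit = 7; bit >= 0; --bit)   { bits.push_back((word >> bit) & 1); } // red, MSB first
    for (int bit = 23; bit >= 16; --bit) { bits.push_back((word >> bit) & 1); } // blue, MSB first
    return bits;
}
```

The swap costs nothing at run time because it is only a matter of which bit index each unrolled macro call tests.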

I don't know if you need/want the device tree overlay I use for this configuration (it configures the two pins, one per PRU, for the direct-mapped high-speed mode), but I have included it at the end.

OK, now to the code.

First, the constants for the memory layout:

#define DMX_OUTPUT (0)
#define SSD_OUTPUT (1)
#define WS2812_OUTPUT (2)

// Byte offsets into PRU data RAM (each header field is a 32-bit value)
#define OUTPUT_TYPE_INDEX   (0)
#define OUTPUT_BIT_REG   (4)         // What output pin are we using on the PRU?
#define DATA_READY_INDEX  (8)
#define OUTPUT_COUNT_INDEX  (12)
#define HEADER_SIZE (16)
#define DATA_INDEX (16)
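
For readers who want to see the header layout exercised in isolation, here is a minimal C++ sketch that writes and reads the four 32-bit header fields in a plain byte buffer. The offsets are copied from the constants above; the k-prefixed names and the writeField, readField and writeHeader helpers are hypothetical. It uses std::memcpy rather than pointer casts, which is one way to sidestep alignment and aliasing questions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Offsets copied from the constants above; the k-prefixed names are hypothetical.
constexpr std::size_t kOutputTypeOffset  = 0;
constexpr std::size_t kOutputPinOffset   = 4;
constexpr std::size_t kDataReadyOffset   = 8;
constexpr std::size_t kOutputCountOffset = 12;
constexpr std::size_t kHeaderSize        = 16;

// Store one 32-bit header field at the given byte offset.
void writeField(unsigned char* mem, std::size_t offset, std::int32_t value)
{
    std::memcpy(mem + offset, &value, sizeof value);
}

// Load one 32-bit header field from the given byte offset.
std::int32_t readField(const unsigned char* mem, std::size_t offset)
{
    std::int32_t value = 0;
    std::memcpy(&value, mem + offset, sizeof value);
    return value;
}

// Fill in a complete header with the data-ready flag cleared.
void writeHeader(unsigned char* mem, std::int32_t type, std::int32_t pin, std::int32_t count)
{
    writeField(mem, kOutputTypeOffset, type);
    writeField(mem, kOutputPinOffset, pin);
    writeField(mem, kDataReadyOffset, 0);
    writeField(mem, kOutputCountOffset, count);
}
```

Channel or pixel data would then start at offset 16, immediately after the header.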


This is what I use to set up the ARM-side class that handles PRU output:

void PRUOutput::init()
{
    std::system("/sbin/modprobe uio_pruss  2>/dev/null");
    tpruss_intc_initdata pruss_intc_initdata = PRUSS_INTC_INITDATA;
    /* Initialize the PRU */
    prussdrv_init ();
   
    int ret = prussdrv_open(PRU_EVTOUT_1);
    if (!ret)
    {
        /* Get the interrupt initialized */
        prussdrv_pruintc_init(&pruss_intc_initdata);
       
        // Map the memory
        prussdrv_map_prumem (PRUSS0_PRU0_DATARAM, &_pruDataMem0);
        _pruData[0] = reinterpret_cast<unsigned char*>(_pruDataMem0);
        prussdrv_map_prumem (PRUSS0_PRU1_DATARAM, &_pruDataMem1);
        _pruData[1] = reinterpret_cast<unsigned char*>(_pruDataMem1);
        prussdrv_map_prumem (PRUSS0_SHARED_DATARAM, &_pruDataSharedMem);
       
       
        // Pointers are set up; now set the initial memory contents
        std::memset(_pruData[0],0,HEADER_SIZE);
        std::memset(_pruData[1],0,HEADER_SIZE);
        std::memset(_pruData[0]+DATA_INDEX,0,MAX_BUFFER_SIZE+3);
        std::memset(_pruData[1]+DATA_INDEX,0,MAX_BUFFER_SIZE+3);
       
        *reinterpret_cast<int32_t*>((_pruData[0]+ OUTPUT_TYPE_INDEX))=*reinterpret_cast<int32_t*>((_pruData[1]+ OUTPUT_TYPE_INDEX))=DMX_OUTPUT;
        *reinterpret_cast<int32_t*>((_pruData[0]+ DATA_READY_INDEX))=*reinterpret_cast<int32_t*>((_pruData[1]+DATA_READY_INDEX))=0;
        *reinterpret_cast<int32_t*>((_pruData[0]+ OUTPUT_COUNT_INDEX))=*reinterpret_cast<int32_t*>((_pruData[1]+OUTPUT_COUNT_INDEX))=512;
        *reinterpret_cast<int32_t*>((_pruData[0]+ OUTPUT_BIT_REG))=14;
        *reinterpret_cast<int32_t*>((_pruData[1]+ OUTPUT_BIT_REG))=1;
        _bufferSize[0]=_bufferSize[1]=512;
        int index=0;
        for (index=0; index<2; index++)
        {
            if (_loaded[index])
            {
                // std::cout <<"Type for index " <<index <<" is "<<_type[index]<<std::endl;
                if (_type[index]==std::string("2812"))
                {
                    // std::cout <<"We think 2812 for index "<<index<<std::endl;
                    int pixel=((_endChannel[index]-_startChannel[index]) + 1 + 2)/3;
                    //if (index==0)
                    {
                        *reinterpret_cast<int32_t*>((_pruData[index]+ OUTPUT_TYPE_INDEX))= WS2812_OUTPUT;
                        *reinterpret_cast<int32_t*>((_pruData[index]+ OUTPUT_COUNT_INDEX))=pixel;
                        _bufferSize[index]=pixel*3;
                    }
                }
                else if (_type[index]==std::string("SSD"))
                {
                    //std::cout <<"we think ssd for index " <<index<<std::endl;
                    *reinterpret_cast<int32_t*>((_pruData[index]+ OUTPUT_TYPE_INDEX))= SSD_OUTPUT;
                    *reinterpret_cast<int32_t*>((_pruData[index]+ OUTPUT_COUNT_INDEX))=3072;
                    _bufferSize[index]=3072;
                }
                else if (_type[index]==std::string("DMX"))
                {
                    // Nothing to do here; the header is preconfigured for DMX above
                }
                //std::cout <<"Starting pru " <<index<<std::endl;
                ret=prussdrv_exec_program (index, "./pru.bin");
                if(ret!=0)
                {
                    std::cerr <<"Error starting pru "<<index<<" error "<<ret<<std::endl;
                }
               
            }
        }
    }
}
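
The pixel count in init() is a round-up division of an inclusive channel range by three. A minimal C++ sketch of just that arithmetic, with a hypothetical helper name pixelCount:

```cpp
#include <cstdint>

// Hypothetical helper: number of 3-channel pixels needed to cover the
// inclusive channel range [startChannel, endChannel], rounding up.
constexpr std::int32_t pixelCount(std::int32_t startChannel, std::int32_t endChannel)
{
    // (end - start) + 1 is the channel count; adding 2 before dividing by 3 rounds up.
    return ((endChannel - startChannel) + 1 + 2) / 3;
}
```

For example, channels 1 through 4 need two pixels, with the last pixel only partly filled.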


These are the ARM-side routines that copy data from the ARM into the PRU memory buffers and set the ready flag:


//=============================================================================
void PRUOutput::outputData(const char *data, int32_t frameLength)
{
    int32_t index;
    for (index=0; index<2; index++)
    {
        if (_loaded[index])
        {
            if (data!=NULL)
            {
                int32_t stoppoint=_endChannel[index];
                if (frameLength<stoppoint)
                {
                    stoppoint=frameLength;
                }
                int32_t length = (stoppoint-_startChannel[index])+1;
                if (length>0)
                {
                    if (length>_bufferSize[index])
                    {
                        length=_bufferSize[index];
                    }
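                    std::memcpy(_pruData[index]+DATA_INDEX,data+_startChannel[index]-1,length);
                    if (length<_bufferSize[index])
                    {
                        // Zero the unused tail of the buffer. The offset is relative to the
                        // bytes just copied, not to the absolute channel number.
                        std::memset(_pruData[index]+DATA_INDEX+length, 0, _bufferSize[index]-length);
                    }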
                }
            }
           
        }
       
    }
    resend();
}
//=============================================================================
void PRUOutput::resend()
{
    *(_pruData[0]+DATA_READY_INDEX)=1;
    *(_pruData[1]+DATA_READY_INDEX)=1;
}
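
The clamping in outputData() can be looked at on its own. Here is a minimal C++ sketch of how many bytes end up being copied for one output, given 1-based inclusive start and end channels; the helper name copyLength is hypothetical and the sketch leaves out the actual copy.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: bytes to copy for one output, following the same
// clamping steps as outputData().
std::int32_t copyLength(std::int32_t startChannel, std::int32_t endChannel,
                        std::int32_t frameLength, std::int32_t bufferSize)
{
    const std::int32_t stop = std::min(endChannel, frameLength); // the frame may be shorter than the range
    const std::int32_t length = (stop - startChannel) + 1;       // inclusive, 1-based channels
    if (length <= 0)
    {
        return 0;                                                // the range starts past the end of the frame
    }
    return std::min(length, bufferSize);                         // never overrun the PRU buffer
}
```

Anything in the PRU buffer beyond the returned length is what gets zero-filled before the ready flag is set.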



And now for the PRU code itself:


.setcallreg r29.w0
.origin 0
.entrypoint START

#define AM33XX

// Refer to this mapping in the file - \prussdrv\include\pruss_intc_mapping.h
#define PRU0_PRU1_INTERRUPT     17
#define PRU1_PRU0_INTERRUPT     18
#define PRU0_ARM_INTERRUPT      19
#define PRU1_ARM_INTERRUPT      20
#define ARM_PRU0_INTERRUPT      21
#define ARM_PRU1_INTERRUPT      22

#define CONST_PRUDRAM   C24
#define CONST_SHAREDRAM C28
#define CONST_L3RAM     C30
#define CONST_DDR       C31

// Address for the Constant Table Block Index Register 0 (CTBIR_0)
#define CTBIR_0         0x22020
// Address for the Constant Table Block Index Register 1 (CTBIR_1)
#define CTBIR_1         0x22024

// Address for the Constant table Programmable Pointer Register 0(CTPPR_0)
#define CTPPR_0         0x22028
// Address for the Constant table Programmable Pointer Register 1(CTPPR_1)
#define CTPPR_1         0x2202C


// Registers we use for our code
#define SCRATCH0 r2
#define LOW_RATE r3
#define OUTPUT_COUNT r5
#define BAUD_DURATION r6
#define SLEEP_TIMER r7
#define DATA_READY r8
#define SLEEP_SCRATCH r14
#define OUTPUT_TYPE r9
#define INDEX_REG r10
#define BYTE_VALUE r11
#define PIXEL_TYPE r12
#define REG_OUTPUT r13


// Index we use
#include "MemOffset.h"

.macro SLEEPUS
.mparam us,inst,lab
    MOV SLEEP_TIMER, (us*100)-1-inst
lab:
    SUB SLEEP_TIMER, SLEEP_TIMER, 1
    QBNE lab, SLEEP_TIMER, 0
.endm

.macro SLEEPUSREG
.mparam us,inst,lab
    MOV SLEEP_TIMER, us
lab:
    SUB SLEEP_TIMER, SLEEP_TIMER, 1
    QBNE lab, SLEEP_TIMER, 0
.endm

.macro SENDBIT
.mparam reg,bit,lab,lab2,lab3
    QBBC lab, BYTE_VALUE,bit
    SET r30, REG_OUTPUT
    QBA lab2
lab:
    CLR r30,REG_OUTPUT

lab2:
    SLEEPUSREG BAUD_DURATION,1,lab3
.endm

.macro SENDPBIT
.mparam reg,bit,lab,lab2,lab3,lab4
        QBBC lab, BYTE_VALUE,bit
        mov BAUD_DURATION,70
        mov LOW_RATE,60
        QBA lab2
lab:
        mov BAUD_DURATION, 35
        mov LOW_RATE,80
lab2:
        SET r30,REG_OUTPUT
        SLEEPUSREG BAUD_DURATION,1,lab3
        CLR R30,REG_OUTPUT
        SLEEPUSREG LOW_RATE,1,lab4
.endm

START:
//      Setup our memory
LBCO    r0, C4, 4, 4
CLR     r0, r0, 4         // Clear SYSCFG[STANDBY_INIT] to enable OCP master port
SBCO    r0, C4, 4, 4


// Get our fixed values
LBCO REG_OUTPUT, CONST_PRUDRAM, OUTPUT_BIT_REG,4
    LBCO OUTPUT_COUNT, CONST_PRUDRAM, OUTPUT_COUNT_INDEX, 4

    LBCO PIXEL_TYPE, CONST_PRUDRAM,OUTPUT_TYPE_INDEX,4
QBEQ DMXOUT, PIXEL_TYPE, DMX_OUTPUT
QBEQ SSDOUT, PIXEL_TYPE, SSD_OUTPUT
QBEQ WS2812OUT, PIXEL_TYPE, WS2812_OUTPUT
JMP START // None of the types we support, keep looking

DMXOUT:
MOV BAUD_DURATION, 400
JMP UARTOUT
SSDOUT:
MOV BAUD_DURATION, 100
UARTOUT:
// Set our output high
SET r30, REG_OUTPUT
// Move our offset for our data ready flag
UWAIT:
SLEEPUS 400,1,UWAITRDY
    LBCO DATA_READY, CONST_PRUDRAM, DATA_READY_INDEX, 4
QBEQ UWAIT, DATA_READY, 0     // If the flag is zero, loop

// It wasn't zero, we need to clear the flag
MOV DATA_READY,0       
SBCO DATA_READY, CONST_PRUDRAM, DATA_READY_INDEX, 4
// Flag is now cleared, we have to actually get our data
// First, send a break and 0


    CLR r30, REG_OUTPUT
    SLEEPUS 92,1,BREAK_LOW
    SET r30, REG_OUTPUT
    SLEEPUS 12,1,BREAK_HIGH
       
    MOV BYTE_VALUE,0
    CALL SNDBYTE
MOV SCRATCH0,OUTPUT_COUNT
MOV INDEX_REG,DATA_INDEX
DLOOP:
LBCO BYTE_VALUE, CONST_PRUDRAM, INDEX_REG,1
CALL SNDBYTE
ADD INDEX_REG,INDEX_REG,1
SUB SCRATCH0, SCRATCH0, 1
QBNE DLOOP, SCRATCH0, 0

    JMP UWAIT
   
WS2812OUT:
    CLR r30, REG_OUTPUT
   
PIXEL:
SLEEPUS 400,1,PWAITRDY
    LBCO DATA_READY, CONST_PRUDRAM, DATA_READY_INDEX, 4
QBEQ PIXEL, DATA_READY, 0     // If the flag is zero, loop
// It wasn't zero, we need to clear the flag
MOV DATA_READY,0       
SBCO DATA_READY, CONST_PRUDRAM, DATA_READY_INDEX, 4

MOV SCRATCH0,OUTPUT_COUNT
MOV INDEX_REG, DATA_INDEX-3
PLOOP:
ADD INDEX_REG, INDEX_REG, 3
LBCO BYTE_VALUE, CONST_PRUDRAM, INDEX_REG,3
SENDPBIT BYTE_VALUE,15,PBIT15,PBIT15SLEEP,PBIT15_SLEEP,PBIT_15_SLEEP
SENDPBIT BYTE_VALUE,14,PBIT14,PBIT14SLEEP,PBIT14_SLEEP,PBIT_14_SLEEP
SENDPBIT BYTE_VALUE,13,PBIT13,PBIT13SLEEP,PBIT13_SLEEP,PBIT_13_SLEEP
SENDPBIT BYTE_VALUE,12,PBIT12,PBIT12SLEEP,PBIT12_SLEEP,PBIT_12_SLEEP
SENDPBIT BYTE_VALUE,11,PBIT11,PBIT11SLEEP,PBIT11_SLEEP,PBIT_11_SLEEP
SENDPBIT BYTE_VALUE,10,PBIT10,PBIT10SLEEP,PBIT10_SLEEP,PBIT_10_SLEEP
SENDPBIT BYTE_VALUE,9,PBIT9,PBIT9SLEEP,PBIT9_SLEEP,PBIT_9_SLEEP
SENDPBIT BYTE_VALUE,8,PBIT8,PBIT8SLEEP,PBIT8_SLEEP,PBIT_8_SLEEP
SENDPBIT BYTE_VALUE,7,PBIT7,PBIT7SLEEP,PBIT7_SLEEP,PBIT_7_SLEEP
SENDPBIT BYTE_VALUE,6,PBIT6,PBIT6SLEEP,PBIT6_SLEEP,PBIT_6_SLEEP
SENDPBIT BYTE_VALUE,5,PBIT5,PBIT5SLEEP,PBIT5_SLEEP,PBIT_5_SLEEP
SENDPBIT BYTE_VALUE,4,PBIT4,PBIT4SLEEP,PBIT4_SLEEP,PBIT_4_SLEEP
SENDPBIT BYTE_VALUE,3,PBIT3,PBIT3SLEEP,PBIT3_SLEEP,PBIT_3_SLEEP
SENDPBIT BYTE_VALUE,2,PBIT2,PBIT2SLEEP,PBIT2_SLEEP,PBIT_2_SLEEP
SENDPBIT BYTE_VALUE,1,PBIT1,PBIT1SLEEP,PBIT1_SLEEP,PBIT_1_SLEEP
SENDPBIT BYTE_VALUE,0,PBIT0,PBIT0SLEEP,PBIT0_SLEEP,PBIT_0_SLEEP
SENDPBIT BYTE_VALUE,23,PBIT23,PBIT23SLEEP,PBIT23_SLEEP,PBIT_23_SLEEP
SENDPBIT BYTE_VALUE,22,PBIT22,PBIT22SLEEP,PBIT22_SLEEP,PBIT_22_SLEEP
SENDPBIT BYTE_VALUE,21,PBIT21,PBIT21SLEEP,PBIT21_SLEEP,PBIT_21_SLEEP
SENDPBIT BYTE_VALUE,20,PBIT20,PBIT20SLEEP,PBIT20_SLEEP,PBIT_20_SLEEP
SENDPBIT BYTE_VALUE,19,PBIT19,PBIT19SLEEP,PBIT19_SLEEP,PBIT_19_SLEEP
SENDPBIT BYTE_VALUE,18,PBIT18,PBIT18SLEEP,PBIT18_SLEEP,PBIT_18_SLEEP
SENDPBIT BYTE_VALUE,17,PBIT17,PBIT17SLEEP,PBIT17_SLEEP,PBIT_17_SLEEP
SENDPBIT BYTE_VALUE,16,PBIT16,PBIT16SLEEP,PBIT16_SLEEP,PBIT_16_SLEEP
SUB SCRATCH0, SCRATCH0, 1
QBNE PLOOP, SCRATCH0, 0

    //SLEEPUS 50,1,PIXEL_RESET
    JMP PIXEL

       
SNDBYTE:

    CLR r30, REG_OUTPUT
    SLEEPUSREG BAUD_DURATION,1,START_LOW
SENDBIT BYTE_VALUE,0,BIT0,BIT0SLEEP,BIT0_SLEEP
SENDBIT BYTE_VALUE,1,BIT1,BIT1SLEEP,BIT1_SLEEP
SENDBIT BYTE_VALUE,2,BIT2,BIT2SLEEP,BIT2_SLEEP
SENDBIT BYTE_VALUE,3,BIT3,BIT3SLEEP,BIT3_SLEEP
SENDBIT BYTE_VALUE,4,BIT4,BIT4SLEEP,BIT4_SLEEP
SENDBIT BYTE_VALUE,5,BIT5,BIT5SLEEP,BIT5_SLEEP
SENDBIT BYTE_VALUE,6,BIT6,BIT6SLEEP,BIT6_SLEEP
SENDBIT BYTE_VALUE,7,BIT7,BIT7SLEEP,BIT7_SLEEP
SET r30, REG_OUTPUT
SLEEPUSREG BAUD_DURATION,1,STOP1_HIGH
SLEEPUSREG BAUD_DURATION,1,STOP2_HIGH

    RET
       
       
       

EXIT:
        MOV R31.b0, PRU0_ARM_INTERRUPT+16
        HALT
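
To relate the loop counts in the PRU code to real time, here is a rough C++ sketch. It assumes the PRU runs at 200 MHz (5 ns per instruction) and that each pass through the delay loop costs two instructions (the SUB and the QBNE); it ignores the handful of instructions around each delay, so the real bit times are slightly longer. The function names are hypothetical.

```cpp
#include <cstdint>

// Assumption: 200 MHz PRU clock, one instruction per 5 ns cycle.
constexpr std::uint32_t kNanosecondsPerInstruction = 5;
// Assumption: each delay-loop pass is two instructions (SUB + QBNE).
constexpr std::uint32_t kInstructionsPerLoopPass = 2;

// Approximate duration of a delay loop with the given count.
constexpr std::uint32_t loopNanoseconds(std::uint32_t count)
{
    return count * kInstructionsPerLoopPass * kNanosecondsPerInstruction;
}

// Approximate serial bit rate for a given per-bit loop count.
constexpr std::uint32_t baudFromCount(std::uint32_t count)
{
    return 1000000000u / loopNanoseconds(count);
}
```

Under those assumptions a count of 400 gives about 4 us per bit (250 kbaud, DMX), a count of 100 gives about 1 us per bit (1 Mbaud, SSD), and the WS2812 counts of 70/60 and 35/80 give roughly 700/600 ns and 350/800 ns for the high/low parts of a one bit and a zero bit.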


This is the device tree overlay that I use. (It is just based on my limited understanding of the entire device tree system, and it seemed to work, so I didn't pursue it after that *grin*.)

/dts-v1/;

/ {
compatible = "ti,beaglebone", "ti,beaglebone-black";
part-number = "DIYB-PRU";
version = "00A0";
exclusive-use = "P8.12,P8.46";

fragment@0 {
target = <0xdeadbeef>;

__overlay__ {

pinmux_mygpio {
pinctrl-single,pins = <0x30 0x6 0xa4 0x5>;
linux,phandle = <0x1>;
phandle = <0x1>;
};
};
};

fragment@1 {
target = <0xdeadbeef>;

__overlay__ {

helper {
compatible = "bone-pinmux-helper";
pinctrl-names = "default";
pinctrl-0 = <0x1>;
status = "okay";
linux,phandle = <0x2>;
phandle = <0x2>;
};
};
};

fragment@2 {
target = <0xdeadbeef>;

__overlay__ {
status = "okay";
};
};

__symbols__ {
mygpio = "/fragment@0/__overlay__/pinmux_mygpio";
test_helper = "/fragment@1/__overlay__/helper";
};

__fixups__ {
am33xx_pinmux = "/fragment@0:target:0";
ocp = "/fragment@1:target:0";
pruss = "/fragment@2:target:0";
};

__local_fixups__ {
fixup = "/fragment@1/__overlay__/helper:pinctrl-0:0";
};
};
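
For anyone puzzling over the pinctrl-single,pins values, here is a small C++ sketch that decodes one pad configuration value. It assumes the usual AM335x pad register layout (bits 2..0 select the mux mode, bit 3 disables the pull resistor, bit 4 selects pull-up, bit 5 enables the input receiver); the struct and function names are hypothetical. In the overlay the values come in pairs of register offset and pad value, so 0x30 0x6 and 0xa4 0x5 put the two pins listed in exclusive-use into mux modes 6 and 5.

```cpp
#include <cstdint>

// Hypothetical decoded view of one pad configuration value.
struct PadConfig
{
    std::uint32_t muxMode;  // bits 2..0
    bool pullDisabled;      // bit 3 (assumed: 1 = pull resistor disabled)
    bool pullUp;            // bit 4 (assumed: 1 = pull-up, 0 = pull-down)
    bool inputEnabled;      // bit 5 (assumed: 1 = receiver enabled)
};

// Split a pad value into its fields under the assumed bit layout.
PadConfig decodePad(std::uint32_t value)
{
    PadConfig pad{};
    pad.muxMode      = value & 0x7u;
    pad.pullDisabled = (value & 0x08u) != 0;
    pad.pullUp       = (value & 0x10u) != 0;
    pad.inputEnabled = (value & 0x20u) != 0;
    return pad;
}
```

Both values in the overlay decode as output-only pins with the receiver off, which fits pins that are only ever driven from R30.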



One is never too old to learn

David Pitts

That is perfect, Charles. Thanks for all your help. It is much appreciated. I wish I'd had the sense to tune in better when you started down the BeagleBone Black path. :)
PixelController, LLC
PixelController.com
