
Re: [Public WebGL] WEBGL_dynamic_texture redux


Thanks for doing this. Comments inline.

On Wed, Oct 24, 2012 at 12:20 AM, Mark Callow <callow.mark@artspark.co.jp> wrote:


I am updating the WEBGL_dynamic_texture proposal to (a) provide a better interface to the stream producer (HTMLVideoElement, etc.) and (b) provide tools for handling timing and synchronization issues. Rather than writing spec. text I have been playing with sample code to see how various ideas feel. The entire sample program is attached. Please review it and send your feedback. Hopefully the embedded comments and IDL interface definitions give sufficient background for understanding.

(a) stemmed from David Sheets' comments to this list requesting that the stream interface be added to the producer HTML elements. The sample code offers two alternatives, shown in the extract below: augmenting the producer element with the stream interface, or keeping it as a separate object.

For (b) I've added query functions based on a monotonically increasing counter to retrieve the current value and to retrieve the value the last time the canvas was presented (updated to the screen).

The first part of the extract shows how the video producer and texture consumer are connected via a new wdtStream interface. The second part, the drawFrame function, shows the acquire and release of frames and also how to determine how long it is taking to display the frames, whether any are being missed, etc.

Once we're all happy with this, I'll update the spec. text and then I think we'll be able to move it from proposal to draft.

  // connectVideo
  // Connect video from the passed HTMLVideoElement to the texture
  // currently bound to TEXTURE_EXTERNAL_OES on the active texture
  // unit.
  // First a wdtStream object is created with its consumer set to
  // the texture. Once the video is loaded, it is set as the
  // producer. This could potentially fail, depending on the
  // video format.
  // interface wdtStream {
  //   enum state {
  //     // Consumer connected; waiting for producer to connect
  //     wdtStreamConnecting,
  //     // Producer & consumer connected. No frames yet.
  //     wdtStreamEmpty,
  //     wdtStreamNewFrameAvailable, // when does this state occur?
  //     wdtStreamOldFrameAvailable, // when does this state occur?
  //     wdtStreamDisconnected
  //   };
  //   // Time taken from acquireImage to posting drawing buffer; default 0? // units? microseconds?
  //   readonly int consumerLatency;
  //   // Frame # (aka Media Stream Count) of most recently inserted frame
  //   // Value is 1 at first frame.
  //   readonly int producerFrame;
  //   // MSC of most recently acquired frame.
  //   readonly int consumerFrame;
  //   // timeout for acquireImage; default 0
  //   int acquireTimeout; // units? microseconds?
  //   // readonly int freq; ? do videos get one per channel? only max frequency of all media streams?
  //   void setConsumerLatency(int); // linear with consumerLatency? >0? clamped? how does it exert backpressure on source?
  // };
  function connectVideo(ctx, video) // Whose function is this? Your internal implementation or an abstraction of the proposed interface?
    g.videoReady = false;
    // What is g? gl? seems related to the stream but videoReady is racy
    // Options for connecting to video
    // OPTION 1: method on WDT extension augments video element
    // with a wdtStream object.
    ctx.dte.createStream(video); // What property of video makes this possible? Is that property part of this specification?
    assert(video.wdtStream.state == wdtStreamConnecting);
    // OPTION 2: method returns a stream object.
    g.vstream = ctx.dte.createStream(ctx /* video? empty? */); // Q.vstream = ctx.dte.createStream(video); assert(Q.vstream == g.vstream); ?
    assert(g.vstream.state == wdtStreamConnecting);

    video.[...] = function () {
        g.loadingFiles.splice(g.loadingFiles.indexOf(video), 1);
        try {
          // OPTION 1: video object augmented with stream
          video.wdtStream.connect(); // If the stream is _part_ of video this hardly seems useful to consumers.
          assert(video.wdtStream.state == wdtStreamEmpty);
          // OPTION 2: separate stream object
          g.vstream.connectProducer(video); // What property of video makes this possible? Is that property part of this specification?
          assert(g.vstream.state == wdtStreamEmpty);
          if (!video.autoplay) { // is this inverted? NOT autoplay -> play?
            video.play(); // Play video
          }
          g.videoReady = true; // do you mean if g.loadingFiles is length 0?
        } catch (e) {
          window.alert("Video texture setup failed: " + e.name);
        }
    };

  function drawFrame(gl)
    var lastFrame;
    var syncValues;
    var latency;
    var graphicsMSCBase;

    // Make sure the canvas /* buffer? */ is sized correctly.

    // Clear the canvas
    gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);

    // Matrix set-up deleted ...

    // To avoid duplicating everything below for each option, use a
    // temporary variable. This will not be necessary in the final
    // code.
    // OPTION 1: augmented video object
    var vstream = g.video.wdtStream;
    // OPTION 2: separate stream object
    var vstream = g.vstream;

    // In the following
    //   UST is a monotonically increasing counter never adjusted by NTP etc.
    //   The unit is nanoseconds but the frequency of update will vary from
    //   system to system. The average frequency at which the counter is
    //   updated should be 5x the highest MSC frequency supported. For
    //   example if highest MSC is 48kHz (audio) the update frequency
    //   should be 240kHz. Most OSes have this kind of counter available.
    //   MSC is the media stream count. It is incremented once/sample; for
    //   video that means once/frame, for audio once/sample. For graphics,
    //   it is incremented once/screen refresh. // good! on most machines, when I time a rendercycle, I get 60-75 Hz?
    //   CPC is the canvas presentation count. It is incremented once
    //   each time the canvas is presented. // this is totally detached from time, yes?
    if (graphicsMSCBase == undefined) {
        graphicsMSCBase = gl.dte.getSyncValues().msc;
    }

    if (lastFrame.msc && vstream.producerFrame > lastFrame.msc + 1) {
      // Missed a frame! Simplify rendering?
    }

    if (!latency.frameCount) {
      // Initialize
      latency.frameCount = 0;
      latency.accumValue = 0;
    }

    if (lastFrame.ust) {
      syncValues = gl.dte.getSyncValues();
      // interface syncValues {
      //     // UST of last present
      //     readonly attribute long long ust;
      //     // Screen refresh count (aka MSC) at last present
      //     // Initialized to 0 on browser start
      //     readonly attribute long msc;
      //     // Canvas presentation count at last present
      //     // Initialized to 0 at canvas creation.
      //     readonly attribute long cpc;
      // };
      // XXX What happens to cpc when switch to another tab?
      if (syncValues.msc - graphicsMSCBase != syncValues.cpc) // this assumes the media rates are locked to the rendering rates
        // We are not keeping up with screen refresh!
        // Or are we? If cpc increment stops when canvas hidden,
        // will need some way to know canvas was hidden so app
        // won't just assume it's not keeping up and therefore
        // adjust its rendering.
        graphicsMSCBase = syncValues.msc; // reset base.
      latency.accumValue += syncValues.ust - lastFrame.ust;
      if (latency.frameCount == 30) { // is this 30 the fps of the encoded video? can it be retrieved from the stream source somehow?
        vstream.setConsumerLatency(latency.accumValue / 30);
        latency.frameCount = 0;
        latency.accumValue = 0;
      }
    }

    if (g.videoReady) {
      if (g.video.wdtStream.acquireImage()) {
        // Record UST of frame acquisition.
        // No such system function in JS so it is added to extension.
        lastFrame.ust = gl.dte.ustnow();
        lastFrame.msc = vstream.consumerFrame;
      // OPTION 2:
      lastFrame = g.stream.consumerFrame; // lastFrame.msc = ...

    // Draw the cube
    gl.drawElements(gl.TRIANGLES, g.box.numIndices, gl.UNSIGNED_BYTE, 0);

    if (g.videoReady)

    // Show the framerate

    currentAngle += incAngle;
    if (currentAngle > 360)
        currentAngle -= 360;
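
To make the timing bookkeeping in drawFrame easier to discuss, here is a minimal standalone sketch of the two calculations it performs: averaging the acquire-to-present interval over a fixed window, and counting producer frames that were skipped. The names LatencyAverager and missedFrames are hypothetical and not part of the proposal; the timestamps are assumed to be UST values in nanoseconds, as described in the comments above.

```javascript
// Hypothetical helper, not part of the proposal: averages the interval
// between frame acquisition and presentation over a fixed window.
class LatencyAverager {
  constructor(windowSize) {
    this.windowSize = windowSize;
    this.frameCount = 0;
    this.accumValue = 0;
  }

  // acquireUst and presentUst are assumed to be UST values in nanoseconds.
  // Returns the mean latency once per full window, otherwise null.
  addSample(acquireUst, presentUst) {
    this.accumValue += presentUst - acquireUst;
    this.frameCount += 1;
    if (this.frameCount < this.windowSize) {
      return null;
    }
    const mean = this.accumValue / this.windowSize;
    this.frameCount = 0;
    this.accumValue = 0;
    return mean;
  }
}

// Number of producer frames skipped between the last acquired frame
// (lastConsumerMsc) and the newest inserted frame (producerMsc).
function missedFrames(lastConsumerMsc, producerMsc) {
  return Math.max(0, producerMsc - lastConsumerMsc - 1);
}
```

In the extract, the value returned by a full window would be what gets passed to setConsumerLatency, and a non-zero missedFrames result corresponds to the "Missed a frame!" branch.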

How many streams may exist for a given media source? If multiple, do they communicate amongst themselves and buffer frames for sharing? If yes, this suggests that a stream's source is separate from its sinks. This source must have a property that tracks the maximum consumer latency.
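
As a rough illustration of that separation, the sketch below models one source with several attached sinks and derives the maximum consumer latency from them. StreamSource, attach, detach and maxConsumerLatency are hypothetical names invented for this example; only the consumerLatency property name is taken from the proposed wdtStream interface.

```javascript
// Hypothetical model: one media source shared by several sinks.
// Each sink is any object with a numeric consumerLatency property.
class StreamSource {
  constructor() {
    this.sinks = new Set();
  }

  attach(sink) {
    this.sinks.add(sink);
    return sink;
  }

  detach(sink) {
    this.sinks.delete(sink);
  }

  // The largest latency reported by any attached sink; 0 with no sinks.
  // A source buffering frames for sharing would need to hold frames at
  // least this long.
  get maxConsumerLatency() {
    let max = 0;
    for (const sink of this.sinks) {
      max = Math.max(max, sink.consumerLatency);
    }
    return max;
  }
}
```

Computing the maximum on demand keeps it correct when a sink changes its latency or detaches, at the cost of a scan over the sinks on each read.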

What type of object may be used to generate a stream source?

Some media sources (cameras, live streams) cannot seek into the future. How does an application with multiple sinks attached to these sources synchronize those outputs? By setting all consumer latencies to the same value?
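
One way to read the "equal latencies" idea, sketched under the assumption that each sink's latency is a plain number in the same unit: since a live source cannot run ahead, the faster sinks have to wait for the slowest one. The function name equalizingDelays is hypothetical.

```javascript
// Hypothetical sketch: given each sink's measured latency, return the
// extra delay each sink would have to add so that all of them present
// a given frame at the same time as the slowest sink.
function equalizingDelays(latencies) {
  const max = Math.max(0, ...latencies);
  return latencies.map((latency) => max - latency);
}
```

For latencies of 5, 12 and 8 the delays are 7, 0 and 4, so every sink ends up with an effective latency of 12.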

Can streams be concatenated? Is the result a stream? I don't think this should be part of the API but I think it should be possible to build on top of the API.
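
To show what "building on top of the API" could look like, here is a minimal sketch of a wrapper that plays several streams back to back while exposing the same two members it consumes. It assumes a stream is any object with a state property and an acquireImage() method returning true when a frame was acquired, and that a finished stream reports the string "disconnected"; those conventions and the name ConcatenatedStream are assumptions made for this example, not the proposal's definitions.

```javascript
// Hypothetical wrapper: presents a list of streams as one stream.
// Assumed stream shape: { state, acquireImage() -> boolean }, with
// state === "disconnected" once a stream has no more frames.
class ConcatenatedStream {
  constructor(streams) {
    this.streams = streams.slice();
    this.index = 0;
  }

  // State of the first stream that is not yet finished, or
  // "disconnected" when every stream is finished.
  get state() {
    for (let i = this.index; i < this.streams.length; i += 1) {
      if (this.streams[i].state !== "disconnected") {
        return this.streams[i].state;
      }
    }
    return "disconnected";
  }

  // Acquire from the current stream, advancing past finished ones.
  acquireImage() {
    while (this.index < this.streams.length) {
      const current = this.streams[this.index];
      if (current.state !== "disconnected") {
        return current.acquireImage();
      }
      this.index += 1;
    }
    return false;
  }
}
```

Because the wrapper offers the same members it relies on, the result can itself be passed wherever a stream of that shape is expected, including to another ConcatenatedStream.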

Is it possible to construct a stream object from a WebGL renderbuffer? Can a developer do this or must the browser implementor be involved? What is the minimal set of interfaces that is required to give developers this kind of flexibility?