
[gameprogrammer] Re: Scripting engines: Upvalues and function refs



On Monday 05 April 2004 20.20, Bob Pendleton wrote:
> On Mon, 2004-04-05 at 06:42, David Olofson wrote:
[...]
> > What I mean is just that if I go for the C/C++ variant, I'd like
> > to have the compiler and/or language prevent such situations,
> > preferably without removing other language features.
>
> Good, I follow you. Notice that in C all functions are essentially
> "static" and in C++ you can only get pointers to static methods and
> they can only access static data.

Yes... And some OO languages have a special type for "function of 
object" (Object Pascal, for example), which seems really rather 
similar to a reference to a local function from the implementation 
POV. Both can have implicit or explicit "context management", so I 
guess which way you do it depends a bit on whether the language uses 
GC (or similar) or explicit memory management.
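
A minimal sketch of the "function of object" idea, using Python purely for illustration (the Counter class is hypothetical, not from the thread). A bound method carries both the function and the object it operates on, much like a reference to a local function has to carry its context:

```python
class Counter:
    """Hypothetical object with one method that uses its own state."""

    def __init__(self):
        self.n = 0

    def bump(self):
        self.n += 1
        return self.n


c = Counter()
ref = c.bump      # "function of object": the function plus its object

first = ref()     # calls Counter.bump with c as the implicit context
second = ref()

# The reference exposes both halves of the pair.
same_object = ref.__self__ is c
same_function = ref.__func__ is Counter.bump
```

With GC or refcounting, the object stays alive for as long as `ref` does, which is the implicit form of context management.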


[...]
> Yeah, I read several at one time. I can't think of them right now.
> Anything Guy Steele wrote about the original thinking behind
> Scheme will help. One of the motivations
> behind scheme was to support exactly the feature you are looking at
> in a lexically scoped language.
>
> This looks like a good place to start looking:
> http://library.readscheme.org/page1.html
>
> LISP and all of its relatives and variants have this feature.

Ok, I'll have a look at that. (Probably later; trying to get things 
going ASAP here...)


> > > > Anyway, that's a matter of taste, I guess... The main reason
> > > > why I'm doing it this way is that I want the VM to be RT
> > > > safe, without opening up the can of worms that is RT safe GC.
> > > > (Maybe later... Right now, I just need this stuff to do the
> > > > job ASAP.)
> > >
> > > There are actually several questions here. One is, how do I do
> > > RT GC and the other is how do I save the semantics of the
> > > language.
> > >
> > > The answer is not as elegant as what I would like, but it
> > > works. When you take a reference to a function you save enough
> > > of the context as is needed to run the function.
> >
> > Right away, as part of the "get funcref" operation?
>
> Yep, right then. That is the only time you know exactly what it is
> supposed to be.

Yes...


> > > That is generally a snapshot
> > > of some portion of the stack. When you call the function, you
> > > push the context on the stack, call the function, and when it
> > > returns, save the context again. Then you provide the equivalent
> > > of free() for contexts. That makes cleanup the programmer's
> > > problem.
> >
> > Yes... However, that makes it very important to grab those
> > function references (and context snapshots) at the exact right
> > moment, so you don't accidentally get a snapshot when the context
> > is in an undesired state. Feels a bit weird...
>
> I thought eating alligator was a bit weird, but it tastes pretty
> good. :-)

I'll have to try that some time. (Not that I think it's more weird than 
eating any other animal; we just don't have many alligators around 
here. ;-)
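
The snapshot scheme quoted above (save the context when the reference is taken, push it on call, save it again on return, free it explicitly) could be sketched roughly like this in Python. The FuncRef class, its methods and the dict-based "context" are all assumptions made for illustration, not part of any engine discussed here:

```python
class FuncRef:
    """Hypothetical funcref: a function plus a private context snapshot."""

    def __init__(self, func, context, names):
        self.func = func
        # Snapshot taken right away, as part of "get funcref".
        self.saved = {name: context[name] for name in names}

    def call(self, *args):
        if self.saved is None:
            raise RuntimeError("funcref context has been freed")
        frame = dict(self.saved)          # "push" the saved context
        result = self.func(frame, *args)
        self.saved = frame                # save the context on return
        return result

    def free(self):
        # Explicit cleanup, the equivalent of free() for contexts.
        self.saved = None


def accumulate(frame, n):
    frame["total"] += n
    return frame["total"]


context = {"total": 100, "unused": "x"}
ref = FuncRef(accumulate, context, ["total"])
context["total"] = 0      # later changes do not reach the snapshot

a = ref.call(1)           # works on the snapshot: 101
b = ref.call(2)           # state persists between calls: 103
ref.free()
```

Only the names the function needs are copied, and the moment the reference is taken decides what the function will see.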


> It is all in what you are used to. I "grew up" with LISP so it
> seems perfectly normal to me.

I grew up with 68k asm (though I started with BASIC, like most home 
computer people at the time) - so *both* ways seem sensible to 
me... :-)

It's just a matter of whether you think of upvalues as accessing a 
parent's "locals", or as implicit arguments. Both ways have their 
uses, but I don't know which one is more useful or which one is less 
dangerous - and I don't think I want both in the same language. (At 
least not with remotely similar syntax...)

How about thinking of functions as classes? The function itself is 
both the class definition and the constructor. (That is, the only way 
to create an instance is to call the function.) Local variables and 
functions are members of the class. A reference to a local function 
is effectively a reference to "function of object". Refcounting, GC 
or similar ensures that instances are kept around until all 
references are gone. A function that returns its public interface 
(or something) is actually a traditional OO style constructor + class 
definition.

---8<--------------------------------------------------
public function (instance)my_class(argument)
{
	// Private stuff
	some_variable;

	// Interface
	public function set(value)
	{
		some_variable = value;
	}
	public function (value)get
	{
		value = some_variable;
	}

	// Constructor body
	set argument;
	print "Created an instance of my_class.\n";
	instance = context;
}

o = my_class 5;
print "o.get returns ", o.get, "\n";
o.set 42;
print "o.get returns ", o.get, "\n";
o = nil;
-------------------------------------------------->8---
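
For comparison, roughly the same "function as class" pattern can be written with closures in an existing language. This is a Python sketch, not the scripting language above; returning a SimpleNamespace stands in for "instance = context", and the names set_value and get_value are only illustrative:

```python
from types import SimpleNamespace


def my_class(argument):
    # Private stuff: lives in the function's context.
    some_variable = None

    # Interface
    def set_value(value):
        nonlocal some_variable
        some_variable = value

    def get_value():
        return some_variable

    # Constructor body
    set_value(argument)
    # Hand out the public interface; the context stays alive
    # for as long as these function references exist.
    return SimpleNamespace(set=set_value, get=get_value)


o = my_class(5)
first = o.get()
o.set(42)
second = o.get()
o = None   # last reference gone; the context can be reclaimed
```

Each call to my_class creates a fresh context, so two instances never share some_variable.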


> > OTOH, if you think of the upvalues as a form of implicit
> > arguments (many languages implement them as invisible arguments
> > and similar "hacks"), this actually makes a lot more sense than
> > managing contexts as real (and shared) objects that hang around
> > for as long as you need them.
>
> Six of one, half a dozen of the other. Semantically the same.

Except that one treats the calling context like an object passed by 
reference while the other effectively passes the used upvalues by 
value. One allows multiple reference instances to share and modify 
upvalues, whereas the other makes upvalues local and/or read-only, 
like normal function arguments.
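
The difference can be shown in a few lines of Python, as a sketch only. A closure using `nonlocal` treats the context as a shared object, while a default argument (evaluated when the inner function is defined) behaves like an upvalue passed by value. The function names are made up for this example:

```python
def demo():
    total = 10

    # Variant 1: context by reference. The upvalue is shared
    # with the parent and can be modified through the funcref.
    def add_shared(n):
        nonlocal total
        total += n
        return total

    # Variant 2: upvalue by value. The default argument copies
    # the current value of total when the function is defined.
    def add_copied(n, total=total):
        total += n          # changes only the local copy
        return total

    r1 = add_shared(5)      # parent's total becomes 15
    r2 = add_copied(5)      # starts from the copied 10
    r3 = add_copied(5)      # the copy is not updated between calls
    r4 = add_shared(5)      # parent's total becomes 20
    return r1, r2, r3, r4, total


results = demo()
```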


> The
> thing is that you want to treat the context the same way no matter
> how the function is called. Having two ways of doing it will
> introduce bugs.

Well... What's "the same way", actually? Is it "getting the reference" 
or "calling via the reference" that is supposed to be the same as a 
direct call? :-)


> > The "grabbing a reference to f() always gives you the same thing"
> > logic doesn't apply anyway, so why not break the rules properly
> > while at it? ;-)
>
> Why doesn't it always act the same way?

Well, it always does the same thing, but the behavior of the function 
will depend on the current state when the reference is taken. That 
is, two references to the exact same function may do very different 
things when called with the same (explicit) arguments. Of course, 
that's the idea, but it could also be a bit confusing, as it looks 
like you're just calling a function.


[...]
> > Yes, especially compared to some other solutions for this
> > problem... I'm going to need one eventually anyway, so maybe I
> > should just leave this stuff alone until then. (That gives me the
> > "illegal context - let's crash!" behavior of C/C++, which is ok
> > for now.)
>
> As long as you can be sure that it causes a crash. You don't want
> it to work by accident.

Right. (Though not even that is a showstopper at this point.) What's 
the simplest way of ensuring that a function call with an illegal 
context will always fail?
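
One possible approach, which is not proposed anywhere in this thread, is to mark a context as dead when its function returns and to check that mark on every call through a reference. A Python sketch, with all class and function names invented for the example:

```python
class IllegalContextError(RuntimeError):
    """Raised when a funcref is called after its context is gone."""


class Context:
    """Hypothetical activation record for a running function."""

    def __init__(self, **local_vars):
        self.local_vars = local_vars
        self.alive = True

    def leave(self):
        # Called when the owning function returns.
        self.alive = False
        self.local_vars = None


def take_ref(func, context):
    def ref(*args):
        # The check runs on every call, so a stale reference can
        # never "work by accident".
        if not context.alive:
            raise IllegalContextError("context no longer exists")
        return func(context.local_vars, *args)
    return ref


def read_x(local_vars):
    return local_vars["x"]


ctx = Context(x=7)
get_x = take_ref(read_x, ctx)
ok = get_x()          # context is still alive
ctx.leave()

try:
    get_x()
    failed = False
except IllegalContextError:
    failed = True
```

The cost is one test per call through a reference, which is constant time.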


> > I'm thinking about the basic N pools of power-of-two sized
> > blocks, perhaps with block splitting and merging. That has the
> > disadvantage of restricting the maximum block size to a fraction
> > of the size of the memory pool, but that won't be much of an
> > issue if arrays are fragmented at the VM level. (No need for
> > copying data around when resizing dynamic arrays, which is kinda'
> > nice in an RT system...)
>
> I dislike wasting, on average, a quarter of allocated memory. I
> prefer using a vector of free lists indexed by size with block
> merging at the time when free() is called. Very fast allocations
> and frees with good block merging behavior. And, if the free lists
> are created so that the most recently freed block on the list is
> also the first to be allocated you get good virtual memory and
> cache behavior as well.

That sounds a lot like what I'm thinking about, except I guess you'd 
recommend other than power-of-two block sizes. Would you suggest just 
using other block sizes, or some sort of dynamic selection of block 
sizes?
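
To make the comparison concrete, here is a rough Python model of the scheme described in the quote: free lists indexed by block size, most recently freed block allocated first, and merging done in free(). It works on integer offsets instead of real memory, and the Arena class and its dict-based bookkeeping are assumptions of this sketch. A real-time implementation would need constant-time list removal (for example doubly linked lists with boundary tags) rather than the linear searches used here.

```python
class Arena:
    """Toy allocator over offsets 0..size-1 (illustration only)."""

    def __init__(self, size):
        self.free_lists = {}   # block size -> stack of start offsets
        self.free_start = {}   # start offset -> size (free blocks)
        self.free_end = {}     # end offset -> start offset (free blocks)
        self.used = {}         # start offset -> size (allocated blocks)
        self._add_free(0, size)

    def _add_free(self, start, size):
        self.free_lists.setdefault(size, []).append(start)
        self.free_start[start] = size
        self.free_end[start + size] = start

    def _remove_free(self, start):
        size = self.free_start.pop(start)
        del self.free_end[start + size]
        self.free_lists[size].remove(start)
        return size

    def alloc(self, size):
        fits = [s for s, lst in self.free_lists.items() if s >= size and lst]
        if not fits:
            return None                      # out of memory
        best = min(fits)
        start = self.free_lists[best][-1]    # most recently freed first
        self._remove_free(start)
        if best > size:
            self._add_free(start + size, best - size)   # split remainder
        self.used[start] = size
        return start

    def free(self, start):
        size = self.used.pop(start)
        nxt = start + size
        if nxt in self.free_start:           # merge with following block
            size += self._remove_free(nxt)
        if start in self.free_end:           # merge with preceding block
            prev = self.free_end[start]
            size += self._remove_free(prev)
            start = prev
        self._add_free(start, size)


arena = Arena(64)
p = arena.alloc(16)
q = arena.alloc(16)
r = arena.alloc(16)
arena.free(q)
s = arena.alloc(16)      # reuses the block that was just freed
```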


[...]
> Consider this example. I have 4 variables in the local context, x1,
> y1, x2, y2, and a function drawLine(). drawLine() draws a line from
> x1, y1 to x2, y2. So I set those variables to the values I want,
> get a reference to drawLine(), and store it in dl1. Now I set the
> variables to new values, get another reference to drawLine(), and
> store it in dl2.
[...]
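
Under the snapshot semantics discussed earlier in this thread, the dl1/dl2 example might look like the following Python sketch. The rest of the quoted example is elided, so this only covers the part shown; take_ref and the dict-based context are assumptions, and draw_line returns its endpoints instead of drawing anything:

```python
def take_ref(func, context, names):
    # Copy the needed variables at the moment the reference is taken.
    snapshot = {name: context[name] for name in names}

    def ref():
        return func(**snapshot)
    return ref


def draw_line(x1, y1, x2, y2):
    # Stand-in for real drawing: just report the endpoints.
    return ((x1, y1), (x2, y2))


coords = ["x1", "y1", "x2", "y2"]
context = {"x1": 0, "y1": 0, "x2": 10, "y2": 10}
dl1 = take_ref(draw_line, context, coords)

context.update(x1=5, y1=5, x2=20, y2=0)
dl2 = take_ref(draw_line, context, coords)

line1 = dl1()   # uses the values from when dl1 was taken
line2 = dl2()   # uses the values from when dl2 was taken
```

Both references point at the same function and take no explicit arguments, yet they produce different lines.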

This requires late binding by name or something like it... So far, all 
binding is done at compile time in my engine, for speed and 
simplicity, but that doesn't mix very well with full dynamic typing. 
(If you look carefully at the example above, you'll notice that it 
won't work unless the "set" and "get" functions are bound at run 
time. "o" is dynamically typed...)

I'm considering just keeping an <index, name> table, where relevant 
names from imported interfaces are added. The table would be used 
when identifier look-ups fail, so that "new" (previously unknown) 
names can be mapped to unique indices when their type is not known at 
compile time.

Example, based on the code above:

	* The definition of my_class adds <0, get> and
	  <1, set> to the name table.

	* When the compiler sees o.get and realizes it
	  doesn't know what type o is (well, it could
	  figure it out in this simple example, but let's
	  ignore that for now), it looks it up in the
	  name table. Thus, o.get evaluates to an element
	  of type NAMEINDEX with a value of 0. This is
	  used as the index argument to get the actual
	  funcref from o.

	* When the resulting code is executed, the VM
	  grabs o (a local variable) and then tries to
	  index it with a NAMEINDEX value of 0. Any
	  object that has something named "get" will
	  return that when name-indexed by 0. Anything
	  else evaluates to nil.
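
The three steps above can be sketched as follows in Python. NameTable, Obj and name_index are hypothetical names for this illustration, and None stands in for nil. Python's dict is itself a hash table with constant average look-up time but no hard upper bound, so this shows the mapping only, not an RT safe implementation:

```python
class NameTable:
    """Maps each name to a unique, stable index (a NAMEINDEX)."""

    def __init__(self):
        self.index_of = {}
        self.names = []

    def intern(self, name):
        if name not in self.index_of:
            self.index_of[name] = len(self.names)
            self.names.append(name)
        return self.index_of[name]


class Obj:
    """Hypothetical VM object whose members are keyed by NAMEINDEX."""

    def __init__(self, table, **members):
        self.by_index = {table.intern(n): v for n, v in members.items()}

    def name_index(self, index):
        # Anything without a member of that name evaluates to nil.
        return self.by_index.get(index)


table = NameTable()

# Defining my_class adds <0, get> and <1, set> to the table.
state = {"value": 5}
o = Obj(table,
        get=lambda: state["value"],
        set=lambda v: state.update(value=v))
other = Obj(table, size=3)          # has no "get" member

# Compile time: o.get becomes a NAMEINDEX constant.
GET = table.intern("get")

# Run time: index the object with that constant.
funcref = o.name_index(GET)
result = funcref()
missing = other.name_index(GET)
```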


The problem is that this gets ugly if/when objects way down the line 
have fields with names that are already in the table - and spread all 
over it... Making it a hash table seems like the obvious answer, but 
I'm open to suggestions. (Speed is important, but bounded look-up 
time is absolutely critical.)


[...]
> > Yes. I just have to decide what I actually want to achieve. :-)
>
> That is the hardest part :-)

Yeah... *hehe* That's why I'm developing this engine step by step 
while using it for various stuff. (The first generation was the 
source interpreting thingy that runs the AGW scripts in Audiality.)


[...]
> > Yeah... But I need function references. Or maybe I just *think* I
> > do, as a result of hacking too much C, C++, Pascal and the like?
>
> I suppose you can live without it. But, I would hate to have to.
> :-) Too many things are just too easy to implement using
> function pointers.

Yeah, that's what I'm thinking, and I can't really think of a sensible 
alternative.


[...]
> P.S.
>
> At one time I was a compiler writer. I got into graphics by working
> on tools for developing microcode for graphic accelerators.

I think you've mentioned compilers before (or I read it somewhere), so 
I figured there would be at least one person on this list with 
serious experience with this stuff. :-)


//David Olofson - Programmer, Composer, Open Source Advocate

.- Audiality -----------------------------------------------.
|  Free/Open Source audio engine for games and multimedia.  |
| MIDI, modular synthesis, real time effects, scripting,... |
`-----------------------------------> http://audiality.org -'
   --- http://olofson.net --- http://www.reologica.se ---