Thursday, June 4, 2015

Operant conditioning:  “the behavior is followed by a consequence, and the nature of the consequence modifies the organism’s tendency to repeat the behavior in the future.”

A behavior followed by a reinforcing stimulus results in an increased probability of that behavior occurring in the future.

What if you don’t give the rat any more pellets?  Apparently, he’s no fool, and after a few futile attempts, he stops his bar-pressing behavior.  This is called extinction of the operant behavior.

Schedules of reinforcement

Continuous reinforcement is the original scenario:  Every time that the rat does the behavior (such as pedal-pushing), he gets a rat goodie.

  1. Fixed-ratio schedules are those where a response is reinforced only after a specified number of responses. This schedule produces a high, steady rate of responding with only a brief pause after the delivery of the reinforcer. An example of a fixed-ratio schedule would be delivering a food pellet to a rat after it presses a bar five times. (commission-based compensation)
  2. Variable-ratio schedules occur when a response is reinforced after an unpredictable number of responses. This schedule creates a high, steady rate of responding. Gambling and lottery games are good examples of a reward based on a variable-ratio schedule. In a lab setting, this might involve delivering a food pellet to a rat after one bar press, again after four bar presses, and a third pellet after two bar presses.  (gambling)

  3. Fixed-interval schedules are those where the first response is rewarded only after a specified amount of time has elapsed. This schedule causes high amounts of responding near the end of the interval, but much slower responding immediately after the delivery of the reinforcer. An example of this in a lab setting would be reinforcing a rat with a food pellet for the first bar press after a 30-second interval has elapsed. (salary-based compensation)
  4. Variable-interval schedules occur when a response is rewarded after an unpredictable amount of time has passed. This schedule produces a slow, steady rate of response. An example of this would be delivering a food pellet to a rat after the first bar press following a one-minute interval, another pellet for the first response following a five-minute interval, and a third food pellet for the first response following a three-minute interval.  (pop quizzes; gambling belongs under variable-ratio, since its payoff depends on the number of plays rather than on elapsed time)
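
The four schedules above differ only in the rule that decides whether a given response earns a reinforcer. The Python sketch below is one way to make those rules concrete. It is an illustration, not a model taken from Skinner's work: the class names, the `respond(t)` method, and the choice of uniform random draws for the "unpredictable" schedules are all assumptions made for the example.

```python
import random


class FixedRatio:
    """Reinforce every n-th response (e.g. every fifth bar press)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        # t (time of the response) is ignored: only the response count matters.
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after an unpredictable number of responses averaging mean_n."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform integer draw on 1..2*mean_n-1, whose mean is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made after `interval` seconds have elapsed."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0  # time of the last reinforcer (session starts at t = 0)

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False


class VariableInterval:
    """Reinforce the first response after an unpredictable wait averaging mean_interval."""

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform draw on 0..2*mean_interval, whose mean is mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


if __name__ == "__main__":
    schedule = FixedRatio(5)
    # Ten bar presses, one per second: presses 5 and 10 are reinforced.
    print([schedule.respond(t) for t in range(1, 11)])
```

Note that the ratio schedules ignore the clock and the interval schedules ignore the response count, which is the whole distinction between the two families.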

Shaping is also known as “the method of successive approximations.”  Basically, it involves first reinforcing a behavior only vaguely similar to the one desired.  Once that is established, you look out for variations that come a little closer to what you want, and so on, until you have the animal performing a behavior that would never show up in ordinary life.  Skinner and his students have been quite successful in teaching simple animals to do some quite extraordinary things.

Systematic desensitization is a technique invented by another behaviorist, Joseph Wolpe.  A person with a phobia -- say of spiders -- is asked to come up with ten scenarios involving spiders and panic of one degree or another.  The first scenario is a very mild one -- say seeing a small spider at a great distance outdoors.  The second is a little more scary, and so on, until the tenth scenario involves something totally terrifying -- say a tarantula climbing on the person’s face while driving a car at a hundred miles an hour!  The therapist then teaches the person how to relax their muscles -- a state that is incompatible with anxiety.  After a few days of practice, the person comes back and goes through the scenarios with the therapist, one step at a time, staying relaxed and backing off if necessary, until they can finally imagine the tarantula while remaining perfectly tension-free.

Aversive stimuli

An aversive stimulus is the opposite of a reinforcing stimulus, something we might find unpleasant or painful.
A behavior followed by an aversive stimulus results in a decreased probability of the behavior occurring in the future.
This both defines an aversive stimulus and describes the form of conditioning known as punishment.  If you shock a rat for doing x, it’ll do a lot less of x.  If you spank Johnny for throwing his toys, he will throw his toys less and less (maybe).

On the other hand, if you remove an already active aversive stimulus after a rat or Johnny performs a certain behavior, you are doing negative reinforcement.  If you turn off the electricity when the rat stands on his hind legs, he’ll do a lot more standing.  If you stop your perpetual nagging when I finally take out the garbage, I’ll be more likely to take out the garbage (perhaps).  You could say it “feels so good” when the aversive stimulus stops that this serves as a reinforcer!

Behavior followed by the removal of an aversive stimulus results in an increased probability of that behavior occurring in the future.

Skinner (contrary to some stereotypes that have arisen about behaviorists) doesn’t “approve” of the use of aversive stimuli -- not because of ethics, but because they don’t work well!

Behavior modification

Behavior modification -- often referred to as b-mod -- is the therapy technique based on Skinner’s work.  It is very straightforward:  Extinguish an undesirable behavior (by removing the reinforcer) and replace it with a desirable behavior by reinforcement.

There is an offshoot of b-mod called the token economy.  This is used primarily in institutions such as psychiatric hospitals, juvenile halls, and prisons.  Certain rules are made explicit in the institution, and behaving yourself appropriately is rewarded with tokens -- poker chips, tickets, funny money, recorded notes, etc.  Certain poor behavior is also often followed by a withdrawal of these tokens.  The tokens can be traded in for desirable things such as candy, cigarettes, games, movies, time out of the institution, and so on.  This has been found to be very effective in maintaining order in these often difficult institutions.
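
The token economy described above comes down to three operations: tokens are awarded for appropriate behavior, withdrawn for poor behavior, and traded in for desirable things. A minimal Python sketch of that bookkeeping follows. The `TokenEconomy` class, its method names, the prices, and the rule that a balance cannot drop below zero are all hypothetical choices for illustration, not a description of how any real institution runs its program.

```python
class TokenEconomy:
    """Toy ledger for a token economy (all names and prices are hypothetical)."""

    def __init__(self, prices):
        self.prices = dict(prices)  # item -> cost in tokens
        self.balances = {}          # person -> tokens currently held

    def reward(self, person, tokens):
        # Appropriate behavior is followed by tokens.
        self.balances[person] = self.balances.get(person, 0) + tokens

    def withdraw(self, person, tokens):
        # Poor behavior is followed by a withdrawal of tokens.
        # Assumption: a balance never goes below zero.
        self.balances[person] = max(0, self.balances.get(person, 0) - tokens)

    def redeem(self, person, item):
        # Trade tokens for an item; refuse if the person cannot afford it.
        price = self.prices[item]
        if self.balances.get(person, 0) < price:
            return False
        self.balances[person] -= price
        return True


if __name__ == "__main__":
    ward = TokenEconomy({"movie": 5, "candy": 2})
    ward.reward("resident_a", 4)           # followed the rules
    ward.withdraw("resident_a", 1)         # broke a rule
    print(ward.redeem("resident_a", "movie"))  # cannot afford it yet
    print(ward.redeem("resident_a", "candy"))  # can afford this
```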

Walden II
In this novel, Skinner describes a utopia-like commune run on his operant principles.

Beyond Freedom and Dignity

The bad do bad because the bad is rewarded.  The good do good because the good is rewarded.  There is no true freedom or dignity.  Right now, our reinforcers for good and bad behavior are chaotic and out of our control -- it’s a matter of having good or bad luck with your “choice” of parents, teachers, peers, and other influences.  Let’s instead take control, as a society, and design our culture in such a way that good gets rewarded and bad gets extinguished! With the right behavioral technology, we can design culture.

Both freedom and dignity are examples of what Skinner calls mentalistic constructs -- unobservable and so useless for a scientific psychology.  Other examples include defense mechanisms, the unconscious, archetypes, fictional finalisms, coping strategies, self-actualization, consciousness, even things like hunger and thirst.  The most important example is what he refers to as the homunculus -- Latin for “the little man” -- that supposedly resides inside us and is used to explain our behavior.  It appears in ideas like soul, mind, ego, will, self, and, of course, personality.
Instead, Skinner recommends that psychologists concentrate on observables, that is, the environment and our behavior in it.

