17. The Limits of Numerical Probability
Both Frank H. Knight and Ludwig von Mises are recognized as founders of intellectual traditions: the Chicago School and the neo-Austrian School of economics, respectively. During their lifetimes, Knight and Mises were engaged in controversies regarding the nature of socialism and capital.1 My focus here, however, will be on a systematic yet rarely noted similarity in the works of Knight and Mises. In particular, both are representatives of the frequency interpretation of probability and share a similar view concerning the limitations of probability theory in economics and the social sciences generally.2 In the following I will (1) briefly restate the principles of the frequency interpretation of probability; (2) show why Knight and Mises must be considered frequency theorists; and (3) discuss and evaluate the arguments provided by Knight and Mises against the possibility of applying probability theory in the area of economic forecasting (whether on the micro or the macro level).
The principal founder and proponent of the frequency interpretation of probability is Richard von Mises, Ludwig’s younger brother.3 There is no reference in Knight to Richard von Mises, and insofar as Knight’s work of primary interest here is concerned—his 1921 Risk, Uncertainty, and Profit—nothing else can be expected (though Knight read German).4 More surprising is the fact that there is also no mention of Richard von Mises and his frequency interpretation in Ludwig von Mises’s systematic treatment of probability in his 1949 work Human Action: A Treatise on Economics.5 Nonetheless, I assume Richard von Mises’s interpretation as the starting point in the following discussion. It will become apparent that Knight is groping toward the solution provided by Richard von Mises, and that Ludwig von Mises was obviously familiar with his brother’s work and in his own work presents what is meant to be a refinement of the frequency interpretation provided by Richard.
According to Richard von Mises, probability must be defined and the range of applicability of probability theory be delineated thus:
1. It is possible to speak about probabilities only in reference to a properly defined collective.6
2. A collective [appropriate for the application of the theory of probability must fulfill] . . . two conditions: (i) the relative frequencies of particular attributes within the collective tend to fixed limits; (ii) these fixed limits are not affected by any place selection. That is to say, if we calculate the relative frequency of some attribute not in the original sequence, but in a partial set, selected according to some fixed rule, then we require that the relative frequency so calculated should tend to the same limit as it does in the original set.7
3. The fulfillment of the condition (ii) will be described as the Principle of Randomness or the Principle of the Impossibility of a Gambling System.8
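The two conditions can be made concrete with a small simulation. The following Python sketch is only an illustration under stated assumptions: a seeded pseudo-random fair die stands in for a collective, a finite run of throws stands in for the unending sequence that the definition strictly requires, and the names `relative_frequency` and `place_selection` are hypothetical labels of my own, not anything found in Richard von Mises. A place selection is understood here as a rule that picks elements by their position or by earlier outcomes, never by the value of the element itself.

```python
import random

def relative_frequency(sequence, attribute):
    """Share of elements in `sequence` that equal `attribute`."""
    return sum(1 for x in sequence if x == attribute) / len(sequence)

def place_selection(sequence, rule):
    """Keep the elements whose position (not value) satisfies `rule`."""
    return [x for i, x in enumerate(sequence) if rule(i)]

# Assumption: a seeded pseudo-random fair die stands in for a collective.
rng = random.Random(0)
throws = [rng.randint(1, 6) for _ in range(300_000)]

# Condition (i): the relative frequency of "six" in the whole sequence.
full = relative_frequency(throws, 6)

# Condition (ii): the same frequency in two partial sequences.
every_third = relative_frequency(place_selection(throws, lambda i: i % 3 == 0), 6)
after_a_six = relative_frequency(
    [throws[i] for i in range(1, len(throws)) if throws[i - 1] == 6], 6
)

print(f"all throws:           {full:.4f}")
print(f"every third throw:    {every_third:.4f}")
print(f"throws following a 6: {after_a_six:.4f}")
```

If the sequence behaves like a collective, the three figures should lie close to one another and close to 1/6; a gambler betting only on "every third throw" or only "after a six" gains nothing, which is the sense of the Principle of the Impossibility of a Gambling System.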
For the purpose of this article, only three observations regarding Richard von Mises’s frequency interpretation are in order. First, there is Mises’s emphatic insistence that the application of the term probability to single events, such as the “probability” of Mr. X dying in the course of the next year, is “utter nonsense.”9 “The theory of probability can never lead to a definite statement concerning a single event.”10 Second, Mises is equally insistent that the probabilities of the probability calculus are objective, empirical properties and magnitudes (rather than subjective beliefs or degrees of confidence). They are based on experience, and further experience may lead to revised measurements or the reclassification of various singular events into various collectives. However, only in referring to objective probabilities can the probability calculus ever be of any practical use.11 Third, and by implication, Mises rejects categorically the notion of a priori probability.12 No such thing as a priori probability exists.13
“In a problem of probability calculus,” according to Richard von Mises, “the data as well as the results are probabilities.”14 “From one or more well-defined collectives, a new collective is derived. . . . The purpose of the theory of probability is to calculate the distribution in the new collective from the known distribution (or distributions) in the initial ones.”15 As in the case of algebra, “[t]here are four, and only four, ways of deriving a collective and all problems treated by the theory of probability can be reduced to a combination of these four fundamental methods.”16 New collectives are derived from known initial ones by means of selection (unchanged distribution), mixing (addition rule), partition (division rule), and/or combination (multiplication rule).17
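The four operations can be illustrated with exact arithmetic on a single die. This is a minimal sketch under assumptions of my own: the initial collective is a fair die with known distribution, the two dice in the combination are taken to be independent, and the variable names are illustrative only.

```python
from fractions import Fraction

# Assumption: a fair die with known distribution is the initial collective.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Selection: a place selection (say, every second throw) leaves the
# distribution unchanged, so the derived collective has the same values.
selected = dict(die)

# Mixing (addition rule): merge several attributes into a coarser one.
p_even = sum(p for face, p in die.items() if face % 2 == 0)

# Partition (division rule): keep only the even throws and renormalise.
p_six_given_even = die[6] / p_even

# Combination (multiplication rule): pair two collectives, here assumed
# to be independent of one another.
two_dice = {(a, b): pa * pb for a, pa in die.items() for b, pb in die.items()}
p_double_six = two_dice[(6, 6)]

print(p_even, p_six_given_even, p_double_six)
```

In each case the inputs are distributions of known collectives and the output is the distribution of a derived collective; nothing is asserted about any single throw.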
As economists, Frank Knight and Ludwig von Mises come upon the subject of probability indirectly, in conjunction with the question concerning the source of entrepreneurial profits and losses. Why, Knight and Mises ask, do profits and losses not disappear as the result of entrepreneurial competition? Why does competition not bring about a state of affairs where the sum of the prices paid for all input factors equals exactly the price of the output, such that the product sum can be apportioned perfectly among its contributing factors?18 Knight and Mises both give the same answer: because of “uncertainty.” Uncertainty concerning the future constellation of demand and supply is the ultimate and ineradicable source of entrepreneurial profit and loss.19 And it is in conjunction with their attempt to explain the nature of uncertainty, then, that both Knight and Mises introduce the concept of “risk,” as a contingency categorically distinct from uncertainty.20
If all changes were to take place in accordance with invariable and universally known laws, they could be foreseen for an indefinite period in advance of their occurrence, and would not upset the perfect apportionment of product values among contributing agencies, and profit (or loss) would not arise.21
However, perfect foresight need not involve the ability to forecast every singular event and the absence of any kind of contingency (or surprise) for profits and losses to disappear. As Knight explains:
[I]t is unnecessary to perfect, profitless imputation that particular occurrences be foreseeable, if only all the alternative possibilities are known and the probability of the occurrence of each can be accurately ascertained. Even though the business man could not know in advance the results of individual ventures, he could operate and base his competitive offers upon accurate foreknowledge of the future if quantitative knowledge of the probability of every possible outcome can be had. For by figuring on the basis of a large number of ventures (whether of his own business alone or in that of business in general) the losses could be converted into fixed costs.22
[Thus, for example,] the bursting of bottles does not introduce an uncertainty or hazard into the business of producing champagne; since in the operations of any producer a practically constant and known proportion of bottles burst, it does not especially matter even whether the proportion is large or small. The loss becomes a fixed cost. . . . And even if a single producer does not deal with a sufficiently large number of cases of the contingency in question . . . to secure constancy in its effects, the same result may easily be realized, through an organization taking in a large number of producers. This, of course, is the principle of insurance, as familiarly illustrated by the chance of fire loss. No one can say whether a particular building will burn, and most building owners do not operate on a sufficient scale to reduce the loss to constancy. . . . But as is well known, the effect of insurance is to extend this base to cover the operations of a large number of persons and convert the contingency into a fixed cost.23
With this definition of “empirical-statistical probability” as “insurable” contingency or “risk,” Knight is in complete accordance with Richard von Mises’s frequency interpretation. At times, he seems to deviate from Mises’s interpretation, as when he assumes also the possibility of a priori probability (in addition to empirical-statistical probability). But not only does Knight ascribe no importance to a priori probability in the conduct of business, his deviation turns out to be little more than a minor if unfortunate slip.24 In any case, Knight deserves credit for strictly separating a priori probability from empirical-statistical probability, which alone is of practical importance (and where a priori considerations play no role whatsoever), and in particular for excluding “risk” (insurable contingencies) as a possible source of profit and loss and delineating it strictly from “uncertainty,” as two categorically distinct sorts of contingency.
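Knight’s champagne and insurance examples rest on a simple statistical effect that a short simulation can display. The sketch below is an illustration only: the breakage rate of 5 percent, the numbers of bottles, the number of simulated seasons, and the function names are all assumptions of mine, chosen for convenience and not taken from Knight.

```python
import random
import statistics

P_BURST = 0.05  # assumed breakage rate, chosen only for illustration
rng = random.Random(1)

def loss_share(n_bottles):
    """Fraction of `n_bottles` that burst in one simulated season."""
    return sum(rng.random() < P_BURST for _ in range(n_bottles)) / n_bottles

def spread(n_bottles, seasons=200):
    """Standard deviation of the loss share across simulated seasons."""
    return statistics.pstdev([loss_share(n_bottles) for _ in range(seasons)])

small = spread(100)      # a single small producer
pooled = spread(10_000)  # many producers pooled, as under insurance

print(f"small producer: {small:.4f}")
print(f"pooled:         {pooled:.4f}")
```

The average loss share is the same in both cases, but its variation from season to season shrinks as the base of cases grows. This is the sense in which pooling converts a contingency into a practically fixed cost.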
Ludwig von Mises reaches the same conclusion. Yet writing four decades later, his treatment of the subject, contrary to that of Knight, is in full awareness of Richard von Mises’s frequency interpretation.
Ludwig von Mises first presents a general (wide) definition of probability:
A statement is probable if our knowledge concerning its content is deficient. We do not know everything which would be required for a definite decision between true and not true. But on the other hand, we do know something about it; we are in a position to say more than simply non liquet or ignoramus.25
Within this general category of probabilistic statements, Mises then distinguishes two categorically distinct subclasses. The first one—probability narrowly understood and permitting the application of the probability calculus—is termed “class probability (or frequency probability).”
Class probability means: We know or assume to know, with regard to the problem concerned, everything about the behavior of a whole class of events or phenomena; but about the actual singular events or phenomena we know nothing but that they are elements of this class.26
With this definition of class probability, Ludwig von Mises shows himself in complete agreement with his brother. For him, too, there is no such thing as a priori probability. Nor is there such a thing as the probability of a singular event. Probability statements refer to “objective” probabilities of collectives (classes). They are based on empirical observations. And they are corrigible by such observations. Yet at the same time Ludwig von Mises’s definition of class probability represents an ingenious simplification and refinement of Richard’s frequency interpretation. On the one hand, in requiring that one know (or assume to know) everything regarding the behavior of the whole class, Ludwig von Mises circumvents the difficulties associated with Richard’s notion of a limit and its application to necessarily finite sequences of events. On the other hand, in requiring of every singular event that nothing be known about it except that it is a member of a certain class, Ludwig von Mises eliminates the need for the “randomness” criterion to be added to the definition of a collective suitable to treatment by the probability calculus.27 Ludwig von Mises’s definition of class probability already entails a definition of randomness (and “logical homogeneity”): to state that nothing is known about any particular event except its membership in a joint (common) class of events is to say the same as that—as far as one knows—each particular event is logically “homogeneous” (as far as the risk under consideration is concerned) to every other event and/or that one knows of no law (and consequently no method of place selection) governing the sequence of particular events.28
However, as pervasive as “risks” (insurable contingencies) are and as important as class or frequency probability accordingly may be, Ludwig von Mises concurs with Knight that risks are not the source of entrepreneurial profit and loss. To account for profit and loss another, different sort of contingency (a different sort of “probability”) must be postulated. What, then, is the nature of this contingency that both Knight and Mises consider as falling outside the realm of phenomena tractable by the probability calculus and giving rise to entrepreneurial profit and loss?
Knight terms this other sort of contingency “true uncertainty” and characterizes it thus:
The probability in which the student of business risk is interested is an estimate, . . . an estimate or intuitive judgment is somewhat like a probability judgment, but very different from either of the types of probability judgment already described [a priori and empirical-statistical].29
The theoretical difference between the probability connected with an estimate and that involved in such phenomena as are dealt with by insurance is, however, of the greatest importance. . . . Take as an illustration any typical business decision. A manufacturer is considering the advisability of making a large commitment in increasing the capacity of his works. He “figures” more or less on the proposition, taking account as well as possible of the various factors more or less susceptible of measurement, but the final result is an “estimate” of the probable outcome of any proposed course of action. What is the “probability” of error (strictly, of any assigned degree of error) in the judgment? It is manifestly meaningless to speak of either calculating such a probability a priori or of determining it empirically by studying a large number of instances. The essential and outstanding fact is that the “instance” in question is so entirely unique that there are no others or not a sufficient number to make it possible to tabulate enough like it to form a basis for any inference of value about any real probability in the case we are interested in. The same obviously applies to most of conduct and not to business decisions alone.30
[Business decisions] deal with situations which are far too unique, generally speaking, for any sort of statistical tabulation to have any value for guidance. The conception of an objectively measurable probability or chance is simply inapplicable. . . . It is this third type of probability or uncertainty which has been neglected in economic theory, and which we propose to put in its rightful place . . . that higher form of uncertainty not susceptible to measurement and hence elimination. It is this true uncertainty which by preventing the theoretically perfect outworking of the tendencies of competition gives the characteristic form of “enterprise” to economic organization as a whole and accounts for the peculiar income of the entrepreneur.31
Noteworthy about Knight’s argument is his emphasis on unique events. Indeed, if the probability calculus is applicable only to classes or collectives, then it follows logically that it cannot be applied to events which are members of no class (or, as Ludwig von Mises would say, events that form a class by themselves) and are thus unique. However, Knight is less forthcoming as regards the immediately following question: What is it that makes certain events unique, such that they cannot be (or cannot be conceived as being) in a class with other events; and how do we identify and distinguish such events from events which can be classified?32
Ludwig von Mises, writing later and in recognition of his brother’s frequency interpretation, provides further clarification in this regard. According to Ludwig von Mises (and there is little doubt that Knight would have agreed with this), two categorically distinct types of empirical events exist: on the one hand, natural events or what might be called accidents, and on the other hand, human actions. Class (or risk) probability is applicable exclusively to the first type of event, i.e., accidents; and it is impermissible to apply it to human action. Rather, human action is the source of “true,” non-quantifiable (Knightian) uncertainty and responsible for the emergence of profit and loss. States Ludwig von Mises,
[t]here are two entirely different instances of probability; we may call them class probability (or frequency probability) and case probability (or the specific understanding of the sciences of human action).33 The field for the application of the former is the field of the natural sciences, entirely ruled by causality; the field for the application of the latter is the field of the sciences of human action, entirely ruled by teleology.34
Unfortunately, in the relevant chapter VI of his magnum opus Ludwig von Mises is less than outspoken in explaining why human actions (choices) are intractable by probability theory (in the frequency interpretation). His answer can be inferred, however.
The question is: Is it scientifically legitimate to assign quantitative probabilities to the performance of certain actions (whether by individuals or groups of individuals)? Is there a numerical probability that I will watch basketball on TV tonight, that I will spend $5 on beer and $10 on red wine at Von’s grocery store on First Street today, that Hillary Clinton will be elected in 2008, that one million German tourists will spend 3 to 3.5 million Euros on about three million Bratwursts in Mallorca in 2007, that Linda will divorce George, that Ben Bernanke will create 5 billion paper dollars next week, that more people will watch MTV than Fox next Christmas night?
For the frequency theorist, the answer to these questions is a clear no. To be sure, we constantly make predictions concerning action-events such as these, but probability calculations do not—and legitimately cannot—play any role in these predictions.
First of all, the frequency theorist will remind us that the application of the term probability to a single event—and all above-mentioned action-events are single events!—is, in Richard von Mises’s words, “utter nonsense.” “The theory of probability can never lead to a definite statement concerning a single event.” “It is possible to speak about probabilities only in reference to a properly defined collective.” “The definition of probability . . . is only concerned with ‘the probability of encountering a certain attribute in a given collective.’”35
What, then, are the corresponding collectives or classes to which the above mentioned single events belong as members? What, for instance, is the class to which the event “I watch basketball on TV tonight” belongs; what is the collective of which “one million German tourists spend 3 to 3.5 million Euros on about three million Bratwursts in Mallorca in 2007” is a member; and what is the appropriate collective for “Linda divorces George”? Without a specified collective and an (assumedly) full count of its individual members and their various attributes no numerical probability statement is possible (or is, if made, arbitrary).
From a formal-logical point of view, no difficulties arise in meeting such a request. For every single event one (or more) corresponding class(es) can be defined. For instance, “I watch basketball on TV tonight” can be considered a member of the class “people watching/not watching basketball on TV tonight” or “American males” doing so. Or it can be considered an element of the class “I watch basketball on TV nightly.” The one million German Bratwurst eaters on Mallorca can be considered a member of the class “annual per capita Bratwurst expenditure by German tourists on Mallorca.” “Linda divorces George” can be an element of “females divorcing/not-divorcing males,” or “Lindas divorcing/not-divorcing Georges,” etc.
However, to have a well-defined—and actually counted and surveyed—collective is only one of the requirements that must be fulfilled to allow the use of numerical probability statements. The second condition to be fulfilled is that of “randomness.” In Richard von Mises’s words, “only such sequences of events or observations, which satisfy the requirements of complete lawlessness or ‘randomness’ [are true] collectives.” In order to employ the probability calculus, it must be impossible to devise “a method of selecting the elements so as to produce a fundamental change in the relative frequencies.”36 “The limiting values of the relative frequencies in a collective must be independent of all possible place selections.”37 Or as Ludwig von Mises expressed the same requirement: for every element of a class it must hold that nothing is known about its attributes under consideration but that it is an element of this class (and that everything is known about the relative frequency of specified attributes for the class as a whole).
It is in connection with this randomness requirement that Ludwig von Mises (and presumably Knight) sees insuperable difficulties in applying probability theory to human actions. True, formal-logically for every single action a corresponding collective can be defined. However, ontologically human actions (whether of individuals or groups) cannot be grouped in “true” collectives but must be conceived as unique events. Why? As Ludwig von Mises would presumably reply, the assumption that one knows nothing about any particular event except its membership in a known class is false in the case of human actions; or, as Richard von Mises would put it, in the case of human actions we know a “selection rule” the application of which leads to fundamental changes regarding the relative frequency (likelihood) of the attribute in question (thus ruling out the use of the probability calculus).
The randomness (or homogeneity) assumption can be made vis-à-vis events of the accident variety. For instance, we know nothing about the attribute of any particular bottle (will it break or not?) except the bottle’s membership in a class of bottles (of which we know the probability of bottles breaking or not); and we know nothing about the attribute of any particular throw of a die (will it be a six or not?) except the throw’s membership in a class of dice throws (of which we know the probability of throwing sixes).
In the case of human actions this assumption is incorrect, however. In the case of human actions, “we know,” writes Ludwig von Mises, “with regard to a particular event, some of the factors determin[ing] its outcome.”38 Hence, insofar as we know more about a single event than merely its membership in a given class of events of which we know the frequency of certain attributes, we are, with regard to human actions, in a better position to make predictions than we are in the case of “accidents,” where nothing about particular events—one bottle’s vs. another’s breaking—is known.
Whereas natural events—accidents—are occurrences determined by generally, time- and place-invariantly—“blindly” and “indiscriminately”—effective forces within (and constrained by) a “natural environment,” we know action-events to be occurrences determined by individually, at specific times and places held and effective value judgments, knowledge, and property (acting as a constraint). That is, we know that human choices and actions result from individual (subjective and momentary) value judgments; that value judgments involve the ranking of valued ends and the presumed correct knowledge of how to reach these ends through some combination of means; and that the valuation of ends and the selection of means are constrained by the quantity and quality of property (possessions) at an individual human actor’s disposal.39
Based on this general knowledge concerning the nature of human actions as opposed to accidents, then, we are in possession of a method which, according to Richard von Mises’s frequency theory, we are most definitely not allowed to possess if the probability calculus is to be applicable: namely a method of “place selection.” We know of no rule for distinguishing one bottle from another as far as breakage is concerned (otherwise they would not be “classed” together). However, for any presumed collective of action-events (such as “men watch basketball on TV tonight” or “I watch basketball on TV nightly”) we do know of such a rule. We do know of a method of decomposing and de-homogenizing every conceivable action-collective ultimately down to its individual elements (such as “American men, teenagers, I, you, Peter, Paul watch basketball” and “I watch basketball on Monday, Tuesday, Wednesday”). This method of place selection, that is, the possibility of devising a method of selecting the elements so as to produce a fundamental change in the relative frequencies of the attributes in question, is called “Verstehen” (understanding).
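The formal point, that a known selection rule disqualifies a sequence as a collective, can be shown with a deliberately crude example. The sketch below is purely hypothetical: it assumes that the actor has told us he watches basketball on the nights his team plays, and that these are Tuesdays and Fridays. Real understanding is of course not a mechanical rule of this kind; the code only shows what happens to relative frequencies once any such knowledge about the single events is available.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Assumption (purely hypothetical): the actor watches basketball on the
# nights his team plays, and we have learned from him that these are
# Tuesdays and Fridays.
GAME_NIGHTS = {"Tue", "Fri"}

# 52 weeks of "I watch basketball tonight" events, in calendar order.
nights = [(day, day in GAME_NIGHTS) for _ in range(52) for day in DAYS]

def frequency(events):
    """Relative frequency of the attribute 'watched' among `events`."""
    return sum(watched for _, watched in events) / len(events)

overall = frequency(nights)          # the presumed collective as a whole
tuesdays = frequency(nights[1::7])   # place selection: every 7th night from index 1
mondays = frequency(nights[0::7])    # place selection: every 7th night from index 0

print(overall, tuesdays, mondays)
```

The overall frequency of two in seven tells us nothing that the selection rule does not tell us better: among the selected Tuesdays the frequency is one, among the selected Mondays it is zero. A sequence whose relative frequencies change in this way under a place selection fails condition (ii) and is not a collective in Richard von Mises’s sense.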
Ludwig von Mises characterizes this method thus: Verstehen
deals with the mental activities of men that determine their actions. It deals with the mental processes that result in a definite kind of behavior, with the reactions of the mind to the conditions of the individual’s environment. It deals with something invisible and intangible that cannot be perceived by the methods of the natural sciences. . . . This specific understanding of the sciences of human action aims at establishing the facts that men attach a definite meaning to the state of their environment, that they value this state and, motivated by these judgments of value, resort to definite means in order to preserve or to attain a definite state of affairs different from that which would prevail if they abstained from any purposeful reaction. Understanding deals with judgments of value, with the choice of ends and of the means resorted to for the attainment of these ends, and with the valuation of the outcome of actions performed.
The methods of scientific inquiry are categorically not different from the procedures applied by everybody in his daily mundane comportment. They are merely more refined and as far as possible purified of inconsistencies and contradictions. Understanding is not a method of procedure peculiar only to historians. It is practiced by infants as soon as they outgrow the merely vegetative stage of their first days and weeks. . . .
The concept of understanding was first elaborated by philosophers and historians who wanted to refute the positivists’ disparagement of the methods of history. . . . But the services understanding renders to man in throwing light on the past are only a preliminary stage in the endeavors to anticipate what may happen in the future. . . . Understanding aims at anticipating future conditions as far as they depend on human ideas, valuations, and actions.40
Unfortunately, in his characterization of the method of Verstehen, Ludwig von Mises fails to expressly identify it as a method of place selection, which leaves his analysis of the categorical distinction between case and class probability in a less than satisfactory state. However, this shortcoming can be rectified by adding two closely related observations to his characterization of Verstehen.
First, it must be added to Mises’s characterization that Verstehen is reached, and possibly refined, by means of verbal communication (symbolic interaction), whether actual or virtual,41 with the entity exhibiting (or expected to exhibit) a certain behavior or attribute. From this two further elementary insights regarding the distinction between natural (accident) events and action-events follow.
On the one hand, it follows that we have access to some entities, namely human actors, that we do not have to others such as dice, bottles, stones, or the sun. We can communicate with, and hence understand, the former, but not the latter. Accordingly, we can answer questions concerning human actions that are simply unanswerable in the case of natural events. We do not know, and have no way of finding out, why dice, bottles, stones, or the sun behave the way they do. True, we can refer to natural laws in explaining their behavior. But we do not know why these laws are the way they are. They happen to be this way rather than that, and in this sense the behavior of dice, bottles, stones, and the sun is and forever remains unintelligible to us. In contrast, we do know, and have a method of finding out, why human actors behave in the way they do. Actors have reasons for acting the way they do, and we can understand these reasons, thus rendering their actions intelligible events (rather than mere “happenings”).42
On the other hand, whereas entities such as dice, bottles, stones, and the sun offer “equal access” to every observer, i.e., every person is in a position to acquire the same knowledge of, and reach the same success in predicting, their behavior, such equality is absent in the case of human actions. To be sure, as a matter of empirical fact one person might be more successful than another in predicting the behavior of dice, bottles, stones, and the sun. This may be because one observer possesses cognitive (including mathematical) abilities that another simply does not have or one person has made a new, hitherto unknown discovery. However, in principle, no obstacle stands in the way of anyone learning what another knows or has newly discovered about the behavior of such entities. All knowledge and every new discovery regarding them is public, open, and ready to be acquired by everyone.
In distinct contrast, the access to human actors by means of verbal communication is not equal and public but privileged and private. Each person has a privileged access to himself. That is, in principle each person is better equipped than anyone else in understanding and predicting his very own actions, and especially his immediately impending actions. By the same token, because every actor has privileged access to himself, the access to other actors—what is called Fremdverstehen or the understanding of “strangers”—is private. That is, each “other” or “stranger” may or may not communicate with someone else and reveal more or less about himself. Or put differently, human actors can reveal or keep secrets, and their investigators accordingly may know more or less about the behavior of this rather than that person, while entities such as dice, bottles, stones, and the sun have no secrets to hide from anyone.
Second, these insights regarding the cognitive accessibility of human actors versus non-communicative entities immediately lead to the final and decisive conclusion: that Verstehen via verbal communication represents a unique method of “individualization.” To be sure, in using a system of spatial-temporal coordinates we can always distinguish one die, bottle, or stone from another and likewise one stone-throwing event or one sunrise from another regarding the same stone or sun. But it is precisely our inability to use any other method of individualization that makes it possible to form “collectives” or “classes” of different stones and bottles and of different stone throws and sunrises of one and the same stone and sun. That is, only because we are unable to distinguish one die, bottle, or stone from another and one stone throw or sunrise from another, except through their location in space and time, are we in a position to say, in accordance with Ludwig von Mises’s definition of class probability, that everything is known about the relative frequency of specified attributes for a class as a whole and nothing is known about the behavior of a particular entity but that it is a member of this class.
In distinct contrast, in the case of human actors communication offers just such another method of individualization. By means of verbal communication, we are in a position to precisely distinguish one actor from any other actor and one action of a given actor from any other following action of the same actor. That is, verbal communication represents a method of synchronic as well as diachronic individualization.
In synchronic perspective, it is impossible to form any actor “collective” made up of Peter, Paul, John, Jim, etc., because it is manifestly untrue to say that we know nothing about their particular actions but that they are the actions of men, American men or American teenage males, for instance, of which we know the relative frequency of some specified attribute such as buying a six-pack of beer today. We can communicate with Peter, Paul, John, and Jim, and thus find out about Peter’s value judgments, knowledge, and property constraints, Paul’s value judgments, knowledge, and property constraints, John’s, and Jim’s. Each of them is faced by his own property constraints and has his own reasons for acting the way he does.
Likewise, in diachronic perspective it is impossible to form any actor “collective” made up of me and my actions performed over time or of Peter and his actions, because it is also false to claim that I know nothing about my actions or Peter’s actions today, tomorrow, in one week, or one month from now but that they are my actions or Peter’s actions and I know everything about the relative frequency of certain attributes within the class of all of my actions or all of Peter’s actions. To say so is untrue for a twofold reason.
On the one hand, it is untrue because I know more about my actions or Peter’s actions today, tomorrow, in one week, and so on, than that they are my actions or Peter’s actions. I know that my and Peter’s actions today are the result of my and Peter’s present value judgments, knowledge, and property constraints, and that my and Peter’s actions tomorrow or in one week are the result of my and Peter’s future—tomorrow’s and next week’s—value judgments, knowledge, and property constraints. I know further that regardless of the outcome—success or failure—of my and Peter’s present actions, my and Peter’s future value judgments, knowledge, and property constraints will be changed as a result of our present actions, such that my and Peter’s future actions will be performed by a different me and a different Peter under different constraints. Moreover, I know that the change effected in me and in Peter and our circumstances as a result of our present actions cannot be predicted by us in advance but only be reconstructed after the event, thus requiring continuously renewed efforts of Verstehen.43
On the other hand and by the same token, we cannot say that we know nothing about the attributes of a particular event but everything about the relative frequency of the same attributes for the entire class of my and Peter’s actions, because the possible attributes of our actions constitute an “open” or “unending” class. For entities such as dice and bottles, for instance, we know all possible attributes. A throw of a die has six possible outcomes and a bottle can either break or not break. And it is only because the number of possible outcomes is thus “closed” that the notion of a “sufficiently long” series of observations (Richard von Mises) can assume any operational meaning. Only because the number of possible attributes is definite can we reasonably claim that a series of observations has been “sufficiently long” for all attributes to have had a chance of showing up, thus allowing us to calculate the relative frequency of any one of them. However, if man can learn in unforeseeable ways and the potential attributes of his actions are open-ended, then no series of observations can ever be considered “sufficiently long,” and hence it becomes impossible to calculate the relative frequency of any given attribute within a class of events.
This, then, brings us to our final conclusion. Frank H. Knight and Ludwig von Mises are entirely correct in insisting that the use of numerical probabilities is impossible in our daily endeavors of predicting our own and our fellow men’s actions. As Richard von Mises, the originator of the frequency interpretation of probability, has unambiguously stated: the application of the term probability to a single event is “utter nonsense.” It is possible to speak about numerical probabilities only in reference to a properly defined collective. But ontologically, no such collective exists as far as human actions are concerned. Each human action must be considered a unique event, constituting a class of its own. The method of Verstehen through verbal communication represents a technique of synchronic as well as diachronic individualization. By means of Verstehen each actor (and each group of actors) can be de-homogenized from any other actor (or group) and every given actor (or group of actors) today can be de-homogenized from the same actor (or the same group) tomorrow. Or in the words of Richard von Mises, Verstehen provides us with a “selection rule” which prohibits every use of “relative frequency” statements, because, by definition, relative (numerical) frequencies require a class made up of more than just one element.44
* Originally published in The Quarterly Journal of Austrian Economics 10, no. 1 (Spring 2007).
- 1. Frank H. Knight, “Professor Mises and the Theory of Capital,” Economica, n.s. 8, no. 32 (1941): 409–27; idem, “The Place of Marginalist Economics in a Collectivist System,” American Economic Review: Supplement 26 (1936): 255–66; idem, “Review of Ludwig von Mises, Socialism,” Journal of Political Economy 46 (1938): 267–68; Ludwig von Mises, Human Action: A Treatise on Economics (Chicago: Regnery, 1966), pp. 490–93, 848f.
- 2. Interestingly, however influential Knight and Mises otherwise have been in shaping their respective schools, neither Knight nor Mises has been entirely successful in convincing his followers of this part of his doctrine. Similarly, while they were skeptical about the use of probability, Knight and Mises were also proponents of “a priori” economic theory, and in this regard, too, neither Knight nor Mises has been entirely successful with his students. See Knight, “Review of T. W. Hutchison, The Significance and Basic Postulates of Economic Theory,” Journal of Political Economy 48, no. 1 (1940); and Mises, Human Action, chap. 2.
- 3. Richard von Mises (1883–1953) was professor of mathematics at the University of Strassburg (1909–1919). In 1921 he was appointed professor of mathematics and director of the Institute of Applied Mathematics at the University of Berlin. When the National-Socialists dismissed him from this post in 1933, Mises went first to Istanbul, Turkey, and in 1939 he emigrated to the U.S., where he finished his career as Gordon McKay Professor of Aerodynamics and Applied Mathematics at Harvard University. Mises’s groundbreaking works on the foundations of probability theory appeared in 1919 in two issues of the Mathematische Zeitschrift. His main work in this area, originally published in German in 1928, is Probability, Statistics and Truth; see also his Positivism: A Study in Human Understanding (Cambridge, Mass.: Harvard University Press, 1951).
- 4. There are a few references to F. Y. Edgeworth, whose views on probability are rather eclectic.
- 5. The two Mises brothers were long estranged and reconciled only during their common exile in the U.S.
- 6. Richard von Mises, Probability, Statistics and Truth (New York: Dover Publications, 1957), p. 28.
- 7. Richard von Mises, Probability, Statistics and Truth, pp. 28–29.
- 8. Richard von Mises further explains the meaning of condition (ii) (randomness) by means of a contrary example:
Imagine, for instance, a road along which milestones are placed, large ones for whole miles and smaller ones for tenths of a mile. If we walk long enough along this road, calculating the relative frequencies of large stones, the value found in this way will lie around 1/10. . . . The deviations from the value 0.1 will become smaller and smaller as the number of stones passed increases; in other words, the relative frequency tends towards the limiting value 0.1. (Ibid., p. 23)
That is, condition (i) is fulfilled. However, absent in this case is condition (ii), because
[t]he sequence of observations of large or small stones differs essentially from the sequence of observations, for instance, of the results of a game of chance, in that the first sequence obeys an easily recognizable law. Exactly every tenth observation leads to the attribute “large,” all others to the attribute “small.” (Ibid., p. 23)
The essential difference between the sequence of the results obtained by casting dice and the regular sequence of large and small milestones consists in the possibility of devising a method of selecting the elements so as to produce a fundamental change in the relative frequencies.
We begin, for instance, with a large stone, and register only every second stone passed. The relation of the relative frequencies of small and large stones will now converge toward 1/5 instead of 1/10. . . . The impossibility of affecting the chances of a game by a system of selection, this uselessness of all systems of gambling, is the characteristic and decisive property common to all sequences of observations or mass phenomena which form the proper subject of probability calculus. . . . The limiting values of the relative frequencies in a collective must be independent of all possible place selections. (Ibid., pp. 24–25)
- 9. Ibid., pp. 17–18.
- 10. Ibid., p. 32.
- 11. Regarding subjectivist interpretations of probability, Mises first remarks that subjectivists such as John Maynard Keynes, for instance, fail to recognize “that if we know nothing about a thing, we cannot say anything about its probability,” and he then notes that “[t]he peculiar approach of the subjectivists lies in the fact that they consider ‘I presume that these cases are equally probable’ to be equivalent to ‘These cases are equally probable,’ since, for them, probability is only a subjective notion.” (R. Mises, Probability, Statistics and Truth, pp. 75–76)
- 12. See also footnote 24 below.
- 13. It is frequently held, explains Mises in this connection, that
if one plays with a “perfect” (“correct”) coin heads or tails and makes sufficiently large numbers of throws, it is almost certain that the proportion of heads will deviate by less than 1 per mille from one half of all cases. With regard to this we only note: The transition from the arithmetic proposition to this empirical proposition can be made only in declaring a “perfect” coin to be one for which the probability of both outcomes is ½ and thus defining probability precisely in the way suggested by us, i.e., as relative empirical frequency in long sequences. (Lehrbuch des Positivismus, p. 267)
“How is it possible to be sure,” Mises asks the proponents of a priori probability, “that each of the six sides of a die is equally likely to appear. . . . Our answer is of course that we do not actually know this unless the dice . . . have been the subject of sufficiently long series of experiments to demonstrate this fact.” (R. Mises, Probability, Statistics and Truth, p. 71)
- 14. Ibid., p. 33.
- 15. Ibid., p. 37.
- 16. Ibid., p. 38.
- 17. Ibid., p. 57.
- 18. Knight’s and Mises’s views concerning backward imputation (apportioning) differ significantly. In equilibrium, according to Knight each production factor is paid in accordance with its marginal value product, whereas according to Mises each production factor is paid in accordance with its discounted marginal value product, i.e., its marginal value product discounted by the originary rate of interest. This difference does not affect any of the arguments presented here or below, however.
- 19. See section III below.
- 20. Frank H. Knight, Risk, Uncertainty and Profit (Chicago: University of Chicago Press, 1971), chaps. 7 and 8; L. Mises, Human Action, chap. 6, and pp. 289–94.
- 21. Risk, Uncertainty and Profit, p. 198. Similarly, Mises writes:
If everybody is correct in anticipating the future state of the market of a certain commodity, its price and the prices of the complementary factors of production concerned would already today be adjusted to this future state. Neither profit nor loss can emerge for those embarking upon this line of business. (L. Mises, Human Action, p. 290)
- 22. Knight, Risk, Uncertainty and Profit, pp. 198–99.
- 23. Ibid., pp. 212–13. Similarly, see L. Mises, Human Action, pp. 291–92.
- 24. According to Knight,
[t]here are two fundamentally different ways of arriving at the probability judgment of the form that a given numerical proportion of X’s are also Y’s. The first method is by a priori calculation. . . . As an illustration of the first type of probability we may take throwing a perfect die. If the die is really perfect and known to be so, it would be merely ridiculous to undertake to throw it a few hundred thousand times to ascertain the probability of its resting on one face or another. (Knight, Risk, Uncertainty and Profit, pp. 214–15)
Richard von Mises’s reply to this definition can be inferred from the quote provided in footnote 13 above: Precisely. But this definition only shows that there is no such thing as a priori probability. Because in order to classify a die as perfect, one must first show this to be true and that cannot be done other than by means of long-run observations.
- 25. L. Mises, Human Action, p. 107.
- 26. Ibid., p. 107. “Let us assume,” Mises further clarifies,
that ten tickets, each bearing the name of a different man, are put into a box. One ticket will be drawn, and the man whose name it bears will be liable to pay 100 dollars. Then an insurer can promise to the loser full indemnification if he is in a position to insure each of the ten for a premium of ten dollars. He will collect 100 dollars and will have to pay the same amount to one of the ten. But if he were to insure only one of them at a rate fixed by the calculus, he would embark not upon an insurance business, but upon gambling. . . . Insurance, whether conducted according to business principles or according to the principle of mutuality, requires the insurance of a whole class or what can be reasonably considered as such. . . . The characteristic mark of insurance is that it deals with the whole class of events. As we pretend to know everything about the behavior of the whole class, there seems to be no specific risk involved in the conduct of the business. . . . Neither is there any specific risk in the business of the keeper of a gambling bank or in the enterprise of a lottery. From the point of view of the lottery enterprise the outcome is predictable, provided that all tickets are sold. If some tickets remain unsold, the enterpriser is in the same position with regard to them as every buyer of a ticket is with regard to the tickets he bought. (Ibid., pp. 108–10)
- 27. See section I.
- 28. Ludwig von Mises was clearly aware of the advantage of this definition. Thus, he notes:
The definition of the essence of class probability as given above is the only logically satisfactory one. It avoids the crude circularity implied in all definitions referring to the equiprobability of possible events. In stating that we know nothing about actual singular events except that they are elements of a class the behavior of which is fully known, this vicious circle is disposed of. Moreover, it is superfluous to add a further condition called the absence of any regularity in the sequence of the singular events. (L. Mises, Human Action, p. 109)
- 29. Knight, Risk, Uncertainty and Profit, pp. 223–24.
- 30. Ibid., p. 226.
- 31. Ibid., pp. 231–32.
- 32. Knight is keenly aware of the unsatisfactory character of his explication of uncertainty-probability (vs. risk-probability). Thus, he notes that “[t]his form of probability is involved in the greatest logical difficulties of all, and no very satisfactory discussion of it can be given, but its distinction from the other types must be emphasized.” (Risk, Uncertainty and Profit, p. 225) Further, “[t]he ultimate logic, or psychology, of these deliberations is obscure, a part of the scientifically unfathomable mystery of life and mind.” (Ibid., p. 227) And yet,
[i]t is indisputable that this procedure is followed in fact to a very large extent and that an astounding number of decisions actually rest upon such a probability judgment, though it cannot be placed in the form of a definite statistical determination. That is, men do form, on the basis of experience, more or less valid opinions as to their own capacity to form correct judgments, and even of the capacities of other men in this regard. (Ibid., p. 228)
- 33. On case probability see section IV below.
- 34. L. Mises, Human Action, p. 107.
- 35. R. Mises, Probability, Statistics and Truth, pp. 17, 33, 28, 12.
- 36. Ibid., p. 24.
- 37. Ibid., pp. 24–25; see also footnote 8 above.
- 38. L. Mises, Human Action, p. 110, emphasis added. In fact, in some cases we know all of the factors determining its outcome. See footnote 44 below.
- 39. It is true that advocates of the positivist-falsificationist research program deny the categorical distinction drawn here between natural events (accidents) and actions and claim that one and the same methodology applies to both realms of phenomena (monism). According to them, both natural events as well as human actions are to be explained by hypothetically valid (and hence empirically falsifiable) general, time- and place-invariantly effective causes. In both cases, we “explain” by formulating causal hypotheses, which are either confirmed or falsified by actual experiences. However, if actions could indeed be conceived of as governed by time- and place-invariantly operating causes just as natural events are, then it is certainly appropriate to ask: what then about explaining the actions of the explainers, i.e., the causal researchers? They are, after all, the persons who carry on the very process of first formulating causal hypotheses and of then assembling confirming or falsifying experience. In order to assimilate confirming or falsifying experiences—to confirm, revise, or replace his initial hypothesis—the causal researcher must assumedly be able to learn from experience. Every positivist-falsificationist is forced to admit this. Otherwise why engage in causal research at all? However, if one can learn from experience in as yet unknown ways, then one admittedly cannot know at any given point in time what one will know at a later point in time and, accordingly, how one will act on the basis of this later knowledge. One can only reconstruct the “causes” of one’s actions after the event, as one can explain one’s knowledge only after one already possesses it. Indeed, no scientific advance could ever alter the fact that one must regard one’s knowledge and actions based on this knowledge as unpredictable on the basis of constantly operating causes. One might hold this conception of freedom to be an illusion. 
And this might well be correct from the point of view of a scientist with cognitive powers substantially superior to any human intelligence, or from the point of view of God. But we are not God, and even if our freedom is illusory from His standpoint and our actions follow a predictable path, for us this is a necessary and unavoidable illusion. We cannot predict in advance, on the basis of our previous state of knowledge, our future state of knowledge and our actions manifesting this knowledge. We can only reconstruct them after the event. Thus, the positivist-falsificationist methodology is simply contradictory when applied to the field of knowledge and action, which contains knowledge as its necessary ingredient. The positivist-falsificationist who formulates a causal explanation (assuming time- and place-invariantly operating causes) for some action is simply engaged in nonsense. His very activity of engaging in research, an enterprise whose outcome he must admit he cannot know in advance because he must admittedly be able to learn, proves that what he pretends to do cannot be done. See also Hoppe, Kritik der kausalwissenschaftlichen Sozialforschung (Opladen: Westdeutscher, 1987); and idem, Economic Science and the Austrian Method (Auburn, Ala.: Ludwig von Mises Institute, 2007).
- 40. Ludwig von Mises, The Ultimate Foundation of Economic Science (Kansas City: Sheed Andrews and McMeel, 1978), pp. 47–49.
- 41. Obviously, one can communicate only with present entities; hence, the distinction between actual and virtual communication. As far as past—and to some extent also distant—entities are concerned, only virtual communication is possible. I cannot engage in actual communication with Caesar, for instance, in order to find out why he crossed the Rubicon. But I can study Caesar’s writings and those of his precursors and contemporaries in order to gain some understanding of his time, his personality, and the situation he faced when he made the decision in question.
- 42. Peter Winch, The Idea of a Social Science and Its Relation to Philosophy (London: Routledge and Kegan Paul, 1970).
- 43. In contrast to the behavior of non-communicative entities, then, which is time-invariant, human actors vary in time and we must communicate with them again and again in order to predict their actions. If man proceeds, as positivists say he does, to interpret a predictive success as a confirmation of his hypothesis such that he would, given the same circumstance, employ the same knowledge in the future, and if he interprets a predictive failure as a falsification such that he would not employ the same but a different hypothesis in the future, he can only do so if he assumes—even if only implicitly—that the behavior of the objects under consideration does not change over the course of time. Otherwise, if their behavior were not assumed to be time-invariant—if the same objects were to behave sometimes this way and at other times in a different way—no conclusion as to what to make of a predictive success or failure would follow. A success would not imply that one’s hypothesis had been temporarily confirmed, and hence, that the same knowledge should be employed in the future. Nor would any predictive failure imply that one should not employ the same hypothesis again under the same circumstances. But this assumption—that the objects of one’s research do not alter their behavior in the course of time—cannot be made with respect to the very subject engaging in research without thereby falling into self-contradiction. For in interpreting his successful predictions as confirmations and his failed predictions as falsifications, the researcher must necessarily assume himself to be a learning subject—someone who can learn about the behavior of objects conceived by him as non-learning objects. Thus, even if everything else may be assumed to have a constant nature, man as a researcher cannot make the same assumption with respect to himself. 
He must be a different person after each confirmation or falsification than he was before, and it is his nature to be able to change over the course of time. See also footnote 39 above.
Consequently, whereas in the case of non-communicative entities the meaning of predictive success and failure is unambiguous: success means “so far your hypothesis has not been falsified, thus apply it again” and failure means “your hypothesis as it stands is wrong, thus change it,” in the case of human actors the meaning of predictive success and failure is necessarily ambiguous. Because the value judgments, knowledge, and property constraints of a given actor can change in the course of time, we might repeat a specific prediction even if it had proved wrong before or change it even if it had turned out right. That is, we can never rest on our past laurels but must always start again fresh and judge the applicability of our past knowledge anew; and hence, we can never accumulate a stock of knowledge that we may blindly rely upon in the future. See also Hoppe, “On Certainty and Uncertainty,” Review of Austrian Economics 10, no. 1 (1997): 60–61, 73; reprinted as chapter 14 herein.
- 44. While Ludwig von Mises is exclusively concerned with probability statements and the categorical distinction between class probability (risk) and case probability (uncertainty), his analysis can be extended to deterministic propositions as well, i.e., to statements regarding which our knowledge concerning their content is not deficient such that we do know everything which would be required for a definite decision between true and not true. In the same way as there exists a categorical distinction between class vs. case probability, so there also exists a categorical distinction between class determinism (event or accident-certainty) vs. case determinism (action-certainty).
For instance (in synchronic perspective), I am certain about what will happen if a stone is thrown into the air: that it will fall to the ground. In fact, every stone will do so, and insofar my certainty extends to every single stone-throwing event. Likewise (in diachronic perspective), I am certain that I will see the sun rise and set in the same constant pattern every day, and insofar my certainty also extends to particular events: to the sun on Monday, Tuesday, Wednesday, etc. However, despite my certainty regarding the outcome of particular events, it still holds true what Ludwig von Mises defines as the characteristicum specificum of class probability: namely, that nothing is known about any particular event except its membership in a given class, while everything is known about the behavior of the whole class of events. The objective probability of the events under consideration, based on long-run frequency observations, is 1; hence, my certainty regarding each singular event. I can be certain regarding each actual, singular event, because I am certain about the behavior of the class, but I have no means to distinguish between the singular events. They are homogeneous as far as the attributes in question are concerned. Each singular event is the outcome of the same general (deterministic) law.
In distinct contrast are the following examples: I am certain that my left arm will rise in a second. I am certain that I will drink a beer tonight. I am certain that I will get out of my bed tomorrow morning. As far as the certainty of these events is concerned, it is no less than that regarding the behavior of stones or the sun. Indeed, one might say that my certainty regarding the former events is even higher than that concerning the latter. After all, the validity of the deterministic laws on which the latter certainty rests is only a hypothetical one, whereas in the former cases it is what one might call a voluntarist-constructivist certainty: I am making the events in question certain; their occurrence depends solely on my will (plus the fact that I am not paralyzed, that I am in possession of a beer, that I own a bed, etc.).
However, as Ludwig von Mises notes regarding probability statements, just as “case probability has nothing in common with class probability but the incompleteness of our knowledge” (Human Action, p. 110), so case determinism (action-certainty) has nothing in common with class determinism (event- or accident-certainty) but the completeness of our knowledge. In every other regard the two are entirely different. For one, whereas I do not know why stones and the sun behave the way they do (I may say that they do so because of the law of gravitation or the Newtonian laws of motion, but there is no further answer, then, as to the question why these laws are the way they are: they are the way they are without anyone understanding why this is so), regarding my own actions (lifting my arm, drinking a beer, getting out of bed) I do know their ultimate cause: they happen, because that is the way I want things to be. Moreover, whereas my certainty regarding the behavior of stones and the sun is based on long-run frequency observations (and the fact that these observations have so far revealed only one and the same result, without any exception), my certainty regarding my arm-lifting, beer-drinking, and getting out of bed is solely based on my present understanding of myself and my present circumstances. However, from my certainty regarding this particular case of arm-lifting, beer-drinking, and getting out of bed nothing follows as regards my future acts of arm-lifting, beer-drinking, and getting out of bed. Rather, any certainty regarding any such future acts of mine must be based on another, future act of understanding myself and my circumstances. In contrast, from my certainty regarding the behavior of one particular stone-throw and the behavior of the sun on Monday it follows that I am just as certain about the result of the next stone-throwing event and the behavior of the sun on Tuesday. 
(Incidentally, apart from these two types of empirical [a posteriori] certainty, there also exists a third type: of logical and praxeological [a priori] certainty.)