Negative Binomial Distribution Example Essay

Negative Binomial Distribution with Excel Example



Probability distribution describing the number of failures ("blanks") obtained before the r-th success ("hit") in a repeated Bernoulli experiment.

In contrast to the binomial distribution, the number of successes is fixed and the number of trials is the variable to be determined.


The name comes from the fact that the distribution can be expressed in the form of the binomial distribution.

r: number of desired events ("hits")

p: probability that the desired event occurs in a single run of the Bernoulli experiment.

F: cumulative distribution function, "the probability that the r-th hit occurs after at most x trials"

f: probability mass function, "the probability that the r-th hit occurs on exactly the x-th trial"
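In standard textbook notation (an assumption, since no formula is given in this text), the two functions can be sketched as follows, with x the number of trials needed to reach the r-th hit:

```latex
f(x) = \binom{x-1}{r-1}\, p^{r} (1-p)^{x-r}, \qquad x = r, r+1, \dots

F(x) = \sum_{k=r}^{x} f(k)

% Equivalent form in terms of the number of failures k = x - r:
P(K = k) = \binom{k+r-1}{k}\, p^{r} (1-p)^{k}
         = (-1)^{k} \binom{-r}{k}\, p^{r} (1-p)^{k}, \qquad k = 0, 1, 2, \dots
```

The last expression, a binomial coefficient with a negative upper index, is the form usually cited as the origin of the name.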


For r=1 one obtains the geometric distribution.
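The special case r=1 can be checked numerically. The following is a minimal sketch, assuming the "number of trials" parameterization described above; the function names are illustrative and not taken from any library.

```python
from math import comb


def neg_binom_pmf(x: int, r: int, p: float) -> float:
    """Probability that the r-th hit occurs on exactly the x-th trial."""
    if x < r:
        return 0.0  # r hits need at least r trials
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


def neg_binom_cdf(x: int, r: int, p: float) -> float:
    """Probability that the r-th hit occurs after at most x trials."""
    return sum(neg_binom_pmf(k, r, p) for k in range(r, x + 1))


def geometric_pmf(x: int, p: float) -> float:
    """Probability that the first hit occurs on exactly the x-th trial."""
    return p * (1 - p) ** (x - 1)


if __name__ == "__main__":
    # For r = 1 both functions should agree up to rounding.
    for x in range(1, 6):
        print(x, neg_binom_pmf(x, 1, 0.3), geometric_pmf(x, 0.3))
```

In Excel, `NEGBINOM.DIST` is parameterized by the number of failures rather than the number of trials, so `x - r` would be passed as its first argument; this is worth checking against the Excel documentation before relying on it.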


For an Excel example of the negative binomial distribution, see here.

1. Introduction

Free-floating carsharing (FFCS) has become a new mode of transportation in big European and American cities. Users do not need to return their carsharing vehicle to its original location, but can start and finish their trip at any parking lot within the operating area. This flexibility has made FFCS a more attractive option than traditional round-trip models. The aim of this paper is to identify significant characteristics of city districts with a high number of FFCS bookings. The study will also help to identify a typical user of these mobility services in Germany, the case-study country.

Operators usually adopt a piecemeal approach when launching FFCS in new cities. According to interviews with the fleet management teams, the initial operating area is not strictly defined. Instead, operators focus on the city centre and gradually integrate new peripheral districts. A definition of promising districts could therefore help the operator decide whether or not to include new city districts in its operating area. Customers can then enjoy FFCS services over a wider area. Operators can also shorten vehicle idle times by planning the system to match the existing market demand.

Spatial analysis is important, but so is assessing the FFCS customer profile. While there is usually no specific information about users, external census data can be used as a proxy to understand the customer. As Seign demonstrated in his doctoral thesis [1], bookings normally start in areas where users live. Census data can therefore reveal insights into the average carsharing user. In this sense, the work can also be read as a study of the typical FFCS user profile, even though external data is used to arrive at this description. Operators can potentially benefit from such customer analyses: they can roll out targeted advertising campaigns and special offers, or cooperate with businesses with similar customer profiles.

One way of understanding the customer is through surveys, but the information gathered tends to be unsuitable for ascertaining the average carsharing profile. While demographic information can be gathered, the only way to establish meaningful relationships between such data would be to ask respondents for their zip codes. The frequency of FFCS trips, which is crucial to differentiating between customer groups, may also be difficult to estimate in a survey. While one could ask respondents directly how often they use FFCS services and offer specific frequencies for them to select, such self-reporting may not represent actual booking behaviour. It is also difficult to get a representative sample of participants, since there is no way to confirm that the sample has the same proportions of customer groups and of their total number of trips as the customer base.

To circumvent these problems, this paper focuses on observing actual carsharing trips. Courtesy of the FFCS operator DriveNow, we obtained booking data for Berlin from January to December 2014. This booking data contains essential information such as the start and end locations of each trip, which allows us to illustrate FFCS demand on a spatial level. We then aggregated the booking start points over the grid of external census data, which shows the population living in each neighbourhood and their main activities. Regression models were then used to find the explanatory variables for the target variable: the number of carsharing trip starts per district.
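The aggregation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual processing: trip start points are assumed to be given in projected metre coordinates, the grid is assumed to be square, and the names `cell_of` and `count_trip_starts` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (x, y) in metres, assumed projected coordinates
Cell = Tuple[int, int]       # (column, row) index of a grid cell


def cell_of(x_m: float, y_m: float, cell_size_m: float) -> Cell:
    """Map a point to the index of the square grid cell that contains it."""
    return (int(x_m // cell_size_m), int(y_m // cell_size_m))


def count_trip_starts(starts: Iterable[Point], cell_size_m: float) -> Counter:
    """Count trip start points per grid cell (the regression target variable)."""
    return Counter(cell_of(x, y, cell_size_m) for x, y in starts)


if __name__ == "__main__":
    # Made-up start points, for illustration only.
    starts = [(10.0, 20.0), (90.0, 95.0), (150.0, 20.0), (-5.0, 10.0)]
    for cell, n in sorted(count_trip_starts(starts, 100.0).items()):
        print(cell, n)
```

The resulting per-cell counts could then be joined with census attributes for the same cells before fitting a count regression.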

This regression model is not only useful for finding explanatory variables in one city, but could potentially be applied to other cities where similar customer behaviour can be assumed. This paper’s results are therefore useful not only for fine-tuning the operating area of an FFCS service in a case-study city, but also for defining potential operating areas in cities that do not currently have such a service. To assess whether the model is transferable to other cities, we used booking and external data from Munich and Cologne to validate the regression model used in Berlin.

This paper starts by reviewing the literature related to the modeling of carsharing demand. A detailed description of all datasets used follows in Section 3. The negative-binomial model is then introduced in Section 4, before the results are presented and interpreted in Section 5. The paper ends with several general conclusions drawn from this research.

2. Literature Review

In the early 2000s, the term smart city shaped the vision for many cities around the globe [2,3,4]. Information and communications technology (ICT) has also influenced the mobility sector and has made shared mobility services more feasible. Carsharing and bicycle sharing are considered essential contributions to smart mobility solutions [5,6,7,8].

Carsharing services have also evolved with the rise of the mobile internet. Fixed vehicle stations have been rendered obsolete by mobile positioning systems that can provide the location of every vehicle in a city, giving rise to so-called free-floating carsharing systems. Customers have found this type of carsharing more attractive because returning to the vehicle pick-up location is no longer mandatory, which allows carsharing services to serve a wider range of purposes than the usual round trips.

In most work on carsharing demand modeling, the expected demand is obtained by accessing and reading the FFCS operator’s API (application programming interface). The interface is commonly used by smartphone applications and websites to provide the current distribution of available cars in the fleet. Such booking data, however, should be treated with caution. The civity study by Brockmeyer et al. [9] used this method to collect the booking data of FFCS operators in Berlin. Since they could only observe the (non-)availability of a vehicle on the map, they could not distinguish between service and customer trips. Instead of the usage time of around 3–4 h per vehicle per day demonstrated by Lenz and Bogenberger [10], they observed a time of 62 minutes. This implies that API data may contain substantial errors. Weigele, a co-author of the civity study, later assumed that there were some errors in the methodology, such as overestimating the assumed booking time [11].

Other studies, such as [12], used this data to measure the influence of points of interest (POI) on the number of bookings. The authors aggregated these datasets over a base grid of squares with an edge length of 100 meters. In their chosen zero-inflated Poisson regression model, the bookings were taken as the dependent variable and the densities of the various POIs as the independent variables. The zero-inflated model design excluded those cells that did not show any bookings, such as parks and other parking-prohibited areas. Variables with a significant positive influence on the number of bookings included bars, restaurants, the airport and areas where residents earn less than 500 EUR per month. A negative correlation was observed in regions with a highly educated population. Through customer surveys in the WiMobil project, Lenz and Bogenberger [10] also identified well-educated men with an average age of 33 as typical users.
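As a rough illustration of the model family mentioned above (not the specification fitted in [12]), a zero-inflated Poisson distribution mixes a point mass at zero with an ordinary Poisson count. The sketch below assumes a rate `lam` and a structural-zero probability `pi`; both symbols and the function name are chosen here for illustration.

```python
from math import exp, factorial


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson: P(Y = k) for rate lam and zero-inflation pi.

    With probability pi a cell is a structural zero (for example, a park
    with no parking); otherwise the count follows a Poisson(lam) law.
    """
    if k < 0:
        return 0.0
    poisson = exp(-lam) * lam**k / factorial(k)
    if k == 0:
        # Zeros arise from both the structural part and the Poisson part.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson


if __name__ == "__main__":
    # Compare the share of zeros with and without zero inflation.
    print(zip_pmf(0, 2.0, 0.0), zip_pmf(0, 2.0, 0.3))
```

In a regression setting, `lam` (and possibly `pi`) would depend on covariates such as POI densities per grid cell; that step is omitted here.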

The first analysis of FFCS bookings was done by Kortum and Machemehl in 2012 [13]. The evaluated data of car2go in Austin showed a high acceptance and use of the system in areas with a high population and household density. A high percentage of citizens between 20 and 39 years old, as well as students or government workers, also had a positive effect on the number of bookings. The last factor could be explained by the fact that many government agencies reduced their own fleet of cars and provided their employees with discounted rates for FFCS.

Most of the literature that analyzes user groups of carsharing systems relates to station-based carsharing systems. A study by De Lorimier and El-Geneidy [14] of Montréal’s station-based Communauto tried to explain varying booking demands. The authors applied a multilevel regression analysis and showed that vehicle age, the concentration of users within a specific geographic region, and the vicinity of stations are important factors for high vehicle usage. Applying an analogous model to a station-based system in Seoul, Kang et al. [15] identified a high density of business offices and a high density of people aged between 20 and 30 as positively correlated with carsharing demand.

However, to understand and predict the use of FFCS, it is necessary to create a more comprehensive customer profile. A classic way to characterize typical customers and their mobility behavior is to use surveys, which can help find attributes of an average user or of groups of users who are more inclined to carshare. Cervero’s characterizations of station-based carsharing users from 2001 [16] and 2002 [17] are among the best-known early works in carsharing demand research. In his surveys, more than 62% of the carsharing users were female, and the average yearly income of the users was about $50,000, which is above average. The study also found that the carsharing system was mainly used during the afternoon for non-work purposes. It also noted that one-third of carsharing users lived alone, and one in four shared their home with non-related adults. Cervero called them the “non-traditional” households [17].

Morency et al. also identified gender and age as characteristics with a significant impact on carsharing behaviour [18]. They also found that user behavior in the previous four months directly influences the current frequency of usage. Kawgan-Kagan, focusing on gender, revealed that female early adopters generally show a higher affinity for bikes and a lower open-mindedness towards new technologies in comparison to male users [19].

In another study, Celsor and Millard-Ball [20] emphasized the importance of the users’ neighborhoods. They distilled the results from other researchers and listed four factors: parking pressure, the ability to live without a car, high population density and the mix of uses in a district.

Stillwater et al. analyzed the dependency of carsharing on public transport. Whereas a neighborhood with a light rail station had a positive impact on the demand for carsharing, regional rail availability decreased the number of bookings [21]. An overview of relevant studies from 1989–2013 on carsharing target groups was put together by Hinkeldein et al. in [22], pp. 182–186. The listed research analyzed factors like mobility-related attitudes, lifestyle, family status and leisure activities. A literature review of general approaches to modeling carsharing demand was published by Jorge and Correia [23].
