Welcome to part 1 of the practical for the social network analysis.
In this practical we will get hands-on experience with the materials in
the lectures about applying Relational Event Models (REMs) on real
relational event history (REH) data: Class data and World Trade Center
(WTC) by doing a bit of programming in R. These data sets are stored in
UUsummerschool.rdata
. Later in this practical we explain
these data sets briefly.
R can be obtained via the https://www.r-project.org/ webpage. To download R, you need to select your preferred CRAN (Comprehensive R Archive Network) mirror (https://cran.r-project.org/mirrors.html). On the Mirrors webpage, you will find listings of countries that have identical versions of R and should select a location geographically close to your computer’s location. R can be downloaded for Linux, Windows, and Mac OS. The pages are regularly updated and you need to check with releases are supported for your platform. R as a base package can perform many statistical analyses but most importantly, R’s functionality can be expanded by downloading specific packages. After installing R (https://www.r-project.org/), it is quite useful to also install R Studio (https://www.rstudio.com/), which provides a convenient interface to R. Once both are installed, opening up R Studio will give a window that is split into 4 panes:
install.packages()
). R is a command line
driven programme and you can enter commands at the prompt
(>
by default) and each command is executed one at a
time.For the current practical, you will need to install a few packages as follows:
## Install the following package
# install.packages ("relevent")
library (relevent)
library(sna)
library(ggplot2)
library(scales)
library(igraph)
library(RColorBrewer)
As it was mentioned in the lecture, dyadic REMs are intended to capture the behavior of systems in which individual social units (persons, organizations, animals, companies, countries, etc.) direct discrete actions towards other individuals in their environment, which form a social network.
First, we need to load UUsummerschool.rdata
and see what
is inside of it.
# Set your own working directory
# setwd()
# Load data using load():
load("UUsummerschool.rdata")
# This R-data contains 11 objects:
ls()
## [1] "as.sociomatrix.eventlist" "Class"
## [3] "ClassIntercept" "ClassIsFemale"
## [5] "ClassIsTeacher" "PartOfApollo_13"
## [7] "Twitter_data_rem3" "WTCPoliceCalls"
## [9] "WTCPoliceIsICR"
What is stored in UUsummerschool.rdata
?
A function as.sociomatrix.eventlist
that outputs a
sociomatrix, and the rest are data sets such as
PartOfApollo_13
, Class
,
Twitter_data_rem3
and WTCPoliceCalls
.
Probably you would ask: “What is a sociomatrix?” Before you let the computer do the social network analysis, it must have some basic data to calculate, like the connection between whom and whom. We should be aware that any one of those connections involves only two individuals. That is, it involves two nodes. This property might already inspire you to think the relationship as a mathematical expression, for instance a matrix of two dimensions. If we put the elements in the same sequence both at the row and the column, and we use 1 and 0 to represent whether there is a connection between the two elements or not, then we will have a matrix as we expect. The matrix is for the social network and we can give it an exclusive name, sociomatrix.
Secondly, let’s check the objects that we will work with in this practical.
as.sociomatrix.eventlist
## function (eventlist, n)
## {
## g <- matrix(0, n, n)
## if (NROW(eventlist) > 0) {
## tabmat <- table(eventlist[, -1, drop = FALSE])
## g[as.numeric(dimnames(tabmat)[[1]]), as.numeric(dimnames(tabmat)[[2]])] <- tabmat
## }
## g
## }
## attr(,"source")
## [1] "function(eventlist,n){"
## [2] " g<-matrix(0,n,n)"
## [3] " if(NROW(eventlist)>0){"
## [4] " tabmat<-table(eventlist[,-1,drop=FALSE])"
## [5] " g[as.numeric(dimnames(tabmat)[[1]]), as.numeric(dimnames(tabmat)[[2]])]<-tabmat"
## [6] " }"
## [7] " g"
## [8] "}"
head(WTCPoliceCalls)
## number source recipient
## 1 1 16 32
## 2 2 32 16
## 3 3 16 32
## 4 4 16 32
## 5 5 11 32
## 6 6 11 32
head(Class)
## StartTime FromId ToId
## 1 0.135 14 12
## 2 0.270 12 14
## 3 0.405 18 12
## 4 0.540 12 18
## 5 0.675 1 12
## 6 0.810 12 1
relevent
package and rem.dyad
functionThe relevent
package (Butts, 2015) in R provides
implementation of relational event framework, with specialized
functionality for easily fitting dyadic relational event models. Within
the relevent
package, the rem.dyad()
function
is the primary workhorse for modeling dyadic data. As you can see the
related documentation, there are a bunch of arguments that can be used
for different purposes. Here, in this practical, we only use a few of
them (please see the pdf file rem.dyad.args
). For more
explanation please check the help file of rem.dyad
as
follows.
## You can access the help file in three different ways:
# 1)
??rem.dyad
# 2)
help(rem.dyad)
# 3)
?relevent::rem.dyad
There is also rem()
for modeling the dyadic data. The
main difference is that rem()
needs the computed statistics
in advance but rem.dyad()
computes the statistics
internally. However, if we have the value of statistics in advance
rem()
tends to be faster.
As applying the regression model, we should first decide which variables/statistics should be in our model. This can be done in several ways: experts knowledge, prior information and statistics/variables selection e.g., using BIC,etc. Note that not all effects may lead to identified models in all cases. It is up to the user to ensure that the postulated model makes sense.
This data contains information about the 9/11 terrorist attacks at the WTC in New York City in 2001. That should be notified that the radio communication was essential in coordinating activities during the crisis. Butts et al.(2007) coded radio communication events between officers responding to 9/11 from transcripts of communications recorded during the event. It is important to note that the WTC radio data was coded from transcripts that lacked detailed timing information; we do not therefore know precisely when these calls were made. But we know the order of events. Since the case of ordinal timing is somewhat simpler than that of exact timing, we consider the WTC data first in this practical. We will illustrate ordinal time REMs using the 481 communication events from 37 named communicants in this data set.
The eventlist is stored in an object called
WTCPoliceCalls
.
Question 1: Check the data using head
and
tail
function. How many events does the data contain? How
many actors? Without testing anything, can you see any trend? If so,
which statistics can capture that trend?
Hint: Use table()
to generate frequency tables for
source
and recipient
attributes. Use
ggplot(data, aes(attribute))+geom_bar()+labs(y="", x="")
to
visualize the data.
# Check the data.
You can see the reciprocation and recency pattern in the first few events: responding officer 16 called officer 32 in the first event, officer 32 then called 16 back in the second, which might be characterized as a local reciprocity effect or \(AB \rightarrow BA\) participation shift. This was followed by 32 being the target of the next four calls, which is probably due to either a role of 32 such as a coordinator or perhaps the presence of a recency mechanism.
Question 2: Use the included sna
function
as.sociomatrix.eventlist()
to convert the eventlist into a
valued sociomatrix. For the definition of sociomatrix see previous
section.
Hint: as.sociomatrix.eventlist(data, number of actors)
.
# Write your code here.
Question 3: Once you have the sociomatrix object, plot the
aggregated network using gplot()
.
Hint: gplot(created sociomatrix, the arguments such as the size of
nodes, width of edges, colors of nodes, etc.)
gplot
creates a node-and-edge visualization of a network.
It can accept an adjacency matrix (N × N), an array of adjacency
matrices (m × N × N), network objects, and other forms of network data
(check sna
library to see what else is possible).
gplot
has many arguments that you can tweak your network
model visualization, for example:
vertex.col
: set the node color.edge.col
: color for edge border and fill.vertex.sides
: number of sides for node polygon. ex)
triangle = 3, square = 4 (default = 8).label.bg
: color for the background of node.gplot
to find other features.# Write your code here.
The plot above is the time-aggregated WTC Police communication network.
Question 4: igraph
is the most commonly used R
package for network analysis. It provides tools to import, transform,
and visualize graphs. The documentation for igraph
can be
found here: https://igraph.org/r/doc/. igraph
functions
operate on igraph objects, therefore we need to transform our
sociomatrix (also called adjacency matrix) into an igraph object; Use
graph_from_adjacency_matrix()
to convert the WTC Police
Calls sociomatrix into a graph object.
Hint: igraph object is a special class of graphs used by
igraph
. igraph graphs store information about vertices,
edges, and attributes in a format of an edge list. You can read more
about igraph
package using ?igraph
.
# Write your code here.
Question 5: Adding relevant data (attributes) to a graph is
called ‘decorating’ a graph. In igraph
it can be done by
using $
operator. Assign corresponding numbers to nodes as
vertex names. This will make it easier to index nodes while calculating
different statistics.
Hint: You can set vertex attributes by using
V(graph)$name_of_attribute
.
# Write your code here.
Question 6: Use WTCPoliceIsICR vector to assign vertex
attribute that shows whether an actor occupies an institutionalized
coordinator role (ICR).
Hint: V(graph)$name_of_attribute
.
# Write your code here.
Question 7: Using igraph
package visualize the
WTC Police Calls network using different layouts to see which layout is
the most beneficial. First, use a circular layout.
Hint: Use
plot(graph, layout=layout_in_circle, vertex.color=...)
to
plot the graph. You can see different layout options using
?layout_()
.
# Write your code here.
Circular layout is the simplest layout. The vertices are arranged around the circumference of a circle (typically ordered by degree) and edges are then drawn across the circle.
Now, try a Fruchterman and Reingold method.
Hint:
plot(graph, layout=layout_with_fr, vertex.color=...)
.
# Write your code here.
Fruchterman and Reingold method is a spring-embedder method of graph drawing wherein the nodes are analogous to balls and edges to springs between them. The goal is to minimize the energy of the system through moving the nodes and changing the forces between them.
Also try the Kamada and Kawai method.
Hint:
plot(graph, layout=layout_with_kk, vertex.color=...)
.
# Write your code here.
Kamada and Kawai method is an energy-placement method which simulates a spring between each vertex. The length of the spring between the vertices is determined by the shortest path between them. The idea is to move the nodes in such a way as to minimize the global energy of the layout.
Question 8: What are the size and order of this network?
Network’s size refers to the total number of it’s edges. And order
refers to the number of vertices in the network.
Hint: gsize(graph)
, gorder(graph)
.
# Write your code here.
Question 9: Degree of a vertex is a number of edges incident
upon that vertex. A degree distribution is a collection of degree
frequencies over the whole network. The degree distribution shows how
connected the graph is and whether some nodes have a notably distinct
degree compared to others. Using ggplot2
create degree
distributions of the WTC Police Calls data.
Hint: degree(graph, mode=c("in", "out", "total"))
;
ggplot(data, aes()) + geom_histogram() + labs(title="", x="", y="")
.
# Write your code here.
After examining the degree distribution plots, we can see that the majority of the nodes have quite low in-, out-, and total degrees. However, there are two highly connected nodes.
Question 10: Centrality measures help identify ‘important’
nodes in a network. Degree centrality measures how ‘important’ a node is
through measuring the number of edges incident upon that node.
Betweeness measures how often a certain node appears on a shortest path
between pairs of other nodes. Closeness measures how close a certain
vertex is to all other vertices. Eigenvector centrality captures the
‘importance’ of a node through it’s neighbors, therefore the more
connected the vertex’s neighbors are the more ‘important’ is the vertex
itself. Calculate centrality measures: degree, betweeness, and
eigenvector (closeness can’t be calculated for this graph due to the
presence of isolated vertices). Extract nodes that score the highest on
each centrality measure.
Hint:
degree(graph, ..., mode=c("all", "out", "in", "total"))
;
betweenness(graph, directed = TRUE...,)
;
eigen_centrality(graph, directed = TRUE, ...)
.
a) highest degree
# Write your code here.
We can see that in this network, node 32 is the most central node with regard to its in-, out-, and total degree.
b) highest betweenness
# Write your code here.
Node 32 also appears on the shortest path between other nodes the most.
c) highest eigenvector
# Write your code here.
Node 32 also has the most well-connected neighbors. These results show us that node 32 is the most ‘important’ node in this network with regards to degree, betweeness, and eigenvector centralities.
Question 11: Community detection identifies subsets
(clusters) of vertices that demonstrate ‘cohesiveness’ with respect to
their relationship patterns. Clusters correspond to sets of nodes with
more edges within the cluster than the rest of the graph. One of the
ways of approaching community detection in directed graphs, called Naive
approach, is to convert the directed network into an undirected one and
assume edge symmetry. Apply community detection algorithm on
WTCPoliceCalls data.
Hint:
as.undirected(graph, mode = c("collapse", "each", "mutual"))
;
cluster_fast_greedy(graph)
.
# Write your code here.
Once you’ve applied the algorithm, you can look how many communities
the algorithm identified using length()
.
# Write your code here.
Additionally, you can see the number of members of each community
using sizes()
.
# Write your code here.
igraph
has a convenient way of plotting results of
community detection. You can use plot(communities, graph)
to visually represent the results.
# Write your code here.
After plotting the results of community detection, we can see that there are two major communities centered around node 32 and node 16. In addition, all isolated nodes have been assigned to their own clusters.
Hierarchical clustering methods produce a hierarchy of nested
partitions of the graph. We can represent this hierarchy in the form of
a dendrogram. Dendrogram is a tree diagram that represents the
similarity among groups of objects. The heights of the branches signify
how similar the objects are to each other. igraph
has a
function dendPlot(communities, mode='phylo')
specifically
for this.
# Write your code here.
Different communities are color-coded in the dendrogram which makes it clear how the nodes are partitioned into different clusters. Looking at this dendrogram we can see how similar the nodes are to the other members of their communities. It appears that the nodes in the same cluster are quite similar in this network.
The question is whether the propensity of individuals to send and
receive calls depends on whether they occupy institutionalized
coordinator roles (ICR) or not. WTCPoliceIsICR
is a vector
that shows the ICR.
One of the arguments in rem.dyad()
function is
ordinal = TRUE
which, by default, treats the data as
ordinal. Because WTCPoliceCalls
data does not contain exact
timing but contains the order of events we do not need to change
anything. However, later in the practical we will consider data with
exact time stamps which will require us to change the value of this
argument.
Question 12: Use rem.dyad()
by passing
appropriate arguments.
Hint:
rem.dyad(data..., n = number of actors, effects = c("CovInt"), covar = list(CovInt = WTCPoliceIsICR), hessian = ...)
.
Note that CovInt is an effect of the exogenous covariate(s) on the
propensity of actors to initiate and receive communication events.
CovInt requires exogenous covariate(s) to be passed as covariate
arguments. Please check the description of arguments
(?rem.dyad
).
# Write your code here.
Once you have fit the model, you can summarize the model fit using
summary()
.
Question 13: Check the model summary. Then see how much were
not very well predicted by your model using
mean(apply(your model$predicted.match,1,all))
.
predicted.match
provides a two-column matrix of true and
false corresponding to the sender and receiver columns in the main data
frame. It shows whether the model could predict the sender and receiver
truly. apply(your model$predicted.match,1,all)
applies
function all
over every row of this matrix. The function
all
checks whether all elements of a logical vector are
TRUE
(whether sender and receiver were correctly
predicted). mean()
calculates the average value of the
apply()
function output. TRUE
is treated as 1
and FALSE
is treated as 0. Therefore, the result of this
function falls between 0 and 1 - the higher the number, the better the
predictions of our model.
Hint: summary(your model name)
.
# Write your code here.
The output gives us the covariate effect, some uncertainty, and goodness-of-fit information. The format is like the output for a regression model summary, but note that the coefficients should be interpreted considering the relational event framework.
Next, we evaluate whether it is worth treating these effects separately with respect to ICR status. To do so, we enter the ICR covariate as a sender and receiver covariate, respectively.
Question 14: Fit another model adding the covariates and
check the model summary as you did above. See how much were not very
well predicted by your model using
mean(apply(your model$predicted.match,1,all))
and compare
it with previous model prediction.
Hint: It is the same as what we did for Q12. But here we should consider
the effect for the sender and for the receiver separately. The structure
is the same but we need to select CovSnd, and CovRec. CovRec is an
actor-level covariate effect for receiving interactions and CovSnd is an
actor-level covariate effect for initiating interactions with others.
Both effects require exogenous covariate(s) to be passed as covariate
arguments. Check the description of arguments to see what you need to
specify for effects
.
# Write your code here.
The output of 0 means that our second model had 0 better predicted senders and receivers compared to the first model.
Question 15: Now, evaluate which model is preferred by BIC
(lower is better) and explain what the BIC tells us.
Hint: You can extract BIC value from each model by using $
sign (e.g., yourmodel$BIC
).
# Write your code here.
Add the reciprocity term: \(AB \rightarrow BA\). For the definition of reciprocity, see the slides. In particular, the AB-BA shift, which captures the tendency for B to call A, given that A has just called B, is likely at play in radio communication.
Question 16: Fit another model adding the reciprocity term
and check the model summary. See how much were not very well predicted
by your model using
mean(apply(your model$predicted.match,1,all))
and compare it
with previous model prediction.
Hint: Add “PSAB-BA” as another statistic to the best model. Check the
description of arguments to see what you need to specify for
effects
.
# Write your code here.
Question 17: Again, evaluate which model is preferred by BIC.
# Compare the BIC of this model with the first model.
Note that the lower BIC the better the model. It appears that there
is a very strong reciprocity effect. Doesn’t it?
We may expect that the current participants in a communication are
likely to initiate the next call and that one’s most recent
communications may be the most likely to be returned.
These processes can be captured with the participation shifts for dyadic
turn receiving/continuing and recency effects, respectively.
Question 18: Fit another model adding the p-shift effects to
the previous model and check the model summary. See how much were not
very well predicted by your model using
mean(apply(your model$predicted.match,1,all))
and compare
it with previous model prediction.
Hint: Add “PSAB-BY” and “PSAB-AY” to the best model. PSAB-BY is a turn
receiving effect - a participant receiving a communication event is the
one initiating the next communication effect \(AB \rightarrow BY\). “PSAB-AY” is a turn
continuing effect, when a participant that initiated a conversation
event initiates the next conversation event \(AB \rightarrow AY\). For more information
check the description of arguments to see what you need to specify for
effects
.
# Write your code here.
Question 19: Evaluate which model is preferred.
# Compare the BIC of this model with the previous model (third model).
Question 20: Lastly, fit a model adding the recency effects
to the previous model and check the model summary. See how much were not
very well predicted by your model using
mean(apply(your model$predicted.match,1,all))
and compare
it with previous model prediction.
Hint: RRecSnd is a recency effect - the recency of conversation event
\(A \rightarrow B\) affects the future
rate of conversation event \(B \rightarrow
A\). RSndSnd is also a recency effect - the recency of
conversation event \(A \rightarrow B\)
affects the future rate of conversation event \(A \rightarrow B\). Are they the statistics
that we need for this purpose?
# Write your code here.
Question 21: Evaluate which model is preferred.
# Compare the BIC of this model with the previous model (fourth model).
The results indicate that turn-receiving, turn-continuing, and
recency effects are all at play in the relational event process.
Question 22: Can you think of adding more
statistics?
Please add a few more and see how the BIC changes. To this end, check
the definition of statistics precisely.
# Try it out yourself.
Question 23: As there are a lot of statistics that we can use, can you think of some ways for statistics selection?
# Try it out yourself.
Model adequacy is a crucial consideration: even though the model fit
is sufficiently good, you need to think of whether this model is good
enough for our purposes. For example, the ability of the relational
event model to predict the next event in the sequence, given those that
have come before.
Note that we do not discuss this further in this practical but
interested students can check Butts 2008.
The classroom dataset includes timed interactions between students and instructors within a high school classroom (McFarland, 2001; Bender-deMoll and McFarland, 2006). Sender and receiver events (n=691) were recorded between 20 actors (18 students and 2 teachers) along with the time of the events in increments of minutes. Note that the data employed here were modified slightly to increase the amount of time occurring between very closely recorded events, ensuring no simultaneity of events as assumed by the relational event framework. Two actor-level covariates are also at hand in the dataset used here: whether the actor was a teacher and whether the actor was female.
Let’s start by examining the raw temporal data and the
time-aggregated network.
In this case, event time is given in increments of minutes from the
onset of observation.
Question 24: Check the data again using head()
.
Can you see any trend?
Hint: Use table()
to generate frequency tables for sender
and receiver attributes. Use
ggplot(data, aes(attribute))+geom_bar()+labs(y="", x="")
to
visualize the data.
# Check the data.
Question 25: Convert the Class
eventlist into a
valued sociomatrix using as.sociomatrix.eventlist()
as you
did above.
Hint: as.sociomatrix.eventlist(data, number of actors)
.
# Write your code here.
Question 26: Plot the network using
gplot()
.
Here we color the female nodes black, the male nodes gray, represent
teachers as square-shaped nodes and students as triangle-shaped nodes.
Edges between nodes are likewise scaled proportional to the number of
communication events transpiring between actors.
Hint: Please check the hint of Q3 and consult the help file of
gplot
.
# Write your code here.
Question 27: Use graph_from_adjacency_matrix()
to convert the Classroom sociomatrix into a graph object.
Hint: graph_from_adjacency_matrix(sociomatrix)
.
# Write your code here.
Question 28: Assign corresponding numbers as vertex
names.
Hint: You can set vertex attributes by using
V(graph)$name_of_attribute
.
# Write your code here.
Question 29: Use ClassIsFemale and ClassIsTeacher vectors to
assign vertex attributes.
Hint: V(graph)$name_of_attribute
.
# Write your code here.
Question 30: Using igraph
package visualize the
Classroom data using different layouts to see which layout is the most
beneficial. First, use a circular layout.
Hint:
plot(graph, layout=layout_in_circle, vertex.color=...)
.
# Write your code here.
Now, try a Fruchterman and Reingold method.
Hint:
plot(graph, layout=layout_with_fr, vertex.color=...)
.
# Write your code here.
Also try the Kamada and Kawai method.
Hint:
plot(graph, layout=layout_with_kk, vertex.color=...)
.
# Write your code here.
Question 31: What are the size and order of this
network?
Hint: gsize(graph)
, gorder(graph)
.
# Write your code here.
Question 32: Create degree distributions of the Classroom
data.
Hint:
ggplot(data, aes()) + geom_histogram() + labs(title="", x="", y="")
.
# Write your code here.
According to the degree distributions, Classroom network is a well-connected network with a few nodes having noticeably higher degree compared to the rest.
Question 33: Calculate centrality measures: degree,
betweeness, closeness and eigenvector. Extract nodes that score the
highest on each centrality measure.
Hint:
degree(graph, ..., mode=c("all", "out", "in", "total"))
;
betweenness(graph, directed = TRUE, ...)
;
closeness(graph, ...)
;
eigen_centrality(graph, directed = TRUE, ...)
.
a) highest degree
# Write your code here.
Node with highest number of incoming edges is node 4. And the node with the highest out-, and total degrees is node 7.
b) highest betweenness
# Write your code here.
The node that occurs most often on the shortest path between other pairs of nodes is node 7.
c) highest closeness
# Write your code here.
Nodes that are located the clossest to other nodes are 4, 7, and 14.
d) highest eigenvector
# Write your code here.
The node with the most connected neighbours is node 4.
Question 34: Apply community detection algorithm on the
Classroom data. First, use naive approach.
Hint:
as.undirected(graph, mode = c("collapse", "each", "mutual")
;
cluster_fast_greedy(graph)
.
# Write your code here.
Use length()
to check how many communities the algorithm
has identified.
# Write your code here.
Use sizes()
to see how large those communities are.
# Write your code here.
Plot the results of the community detection using
plot(communities, graph)
.
# Write your code here.
Interestingly, the community sizes do not vary a lot. In addition, we can see that there is quite a lot of edges between the communities as well as within them which is not surprising given how well connected this network is.
Create a dendrogram using
dendPlot(communities, mode='phylo')
.
# Write your code here.
According to the dendrogram, even though node 19 is assigned to a cluster, it is quite different from the nodes in it’s cluster. Similarly, node 2 is quite different from the other members of its cluster. This is also seen from the graph representation above, those two nodes are less connected with the other members of their clusters.
We are going to use REMs for event histories with exact timing information.
Question 35: Fit a simple intercept model, containing only a
vector of 1s as an actor-level sending effect. And check the model
summary. See how much were not very well predicted by your model using
mean(apply(your model$predicted.match,1,all))
Hint: This vector is saved as ClassIntercept
, which we can
pass to the respective covariate arguments in rem.dyad()
.
Also note that we do not want to discard timing information
(ordinal=FALSE
).
# Write your code here.
The model does not fit any better than the null, because it is equivalent to the null model (as indicated by the absence of difference between the null and residual deviance).
Question 36: Now fit a more interesting covariate model that
considers gender(male/female) and role(teacher/student) for senders and
receivers. And evaluate whether there is any improvement over the
intercept-only model by BIC. See how much were not very well predicted
by your model using
mean(apply(your model$predicted.match,1,all))
and compare
it with previous model prediction.
# Write your code here.
# Compare the BIC of this model with the previous model.
A good improvement over the null model. But note that gender does not appear to be predictive of sending communication. A better model may be one without that specific term included.
Question 37: Fit another model excluding the non-significant
term and again compare to the previous model by BIC and
prediction.
Hint: Based on the above outputs, which statistics should be
included/excluded?
# Write your code here.
Question 38: Is there any improvement?