Bad experience at OpenPsych journals

OpenPsych journals (for the moment, Open Differential Psychology, ODP; Open Behavioral Genetics, OBG) are new journals (1 year old now) with a policy that allows free open-access publication with open peer review. Currently, the editors are Emil O. Kirkegaard (ODP) and Davide Piffer (OBG). I believe it is an interesting project, but I have been disappointed by the behavior of the main editor, Emil.

Emil is the main reason why I parted ways with OP. He has published many papers at ODP, but many times he refused to answer my objections, and he was extremely dishonest in his comments on my first paper at the OP journals, so I decided to withdraw my submission and not to publish at OP again. I was a regular reviewer, and a fast one, but now that I have decided not to publish there, I see no more reason to review for OP, except perhaps if a paper is extremely important. I have also decided not to review again for any author with whom I had big problems before (e.g., nooffensebut, Emil).

Major problems

Lack of professionalism of the author (or reviewer)

1. Parents’ Income is a Poor Predictor of SAT Score [ODP, 2014] (Author : But, Nooffense)

The first time I saw this was when I was reviewing the paper “Parents’ Income is a Poor Predictor of SAT Score” by nooffensebut (now changed to Nooffense But). In my comments, I told him that, like many other practitioners, he was wrong in his interpretation of multiple regression. I referred to this blog post. He replied, but failed to provide convincing evidence for anything he said, and seeing that I was pushing the matter, he lost his temper. I have mixed feelings about how it got published. Emil seemed to agree with my understanding of multiple regression (he accepted it and stated it more clearly later), and yet he said there was no problem with Nooffense But’s article. I can see only two explanations : 1) Emil may have thought the issue I was raising was so provocative that it needed to be recognized by the majority of experts first, although he said nothing about this, or 2) Emil may have thought that, by disapproving, the journal would lose a talented author (and potential reviewer). In any case, if an author behaves like this, anyone can disapprove the paper (which I did). But it would be better to let people know what happened in a more visible way (see below, section “improvement of the system”).

2. Sexual selection explains sex and country differences in fluid g [OBG, 2014] (Author : Piffer, Davide)

Anyone who reads the thread can see that Piffer lost his temper when Peter Frost rejected the paper due to disagreement about the theory of sexual selection. Chuck also had a problem with the cognitive test used by Piffer, and Piffer of course wasn’t happy. I don’t think there is a problem with getting angry at something or someone, but only if there is a good reason. Here, I see no good reason.

But that’s not the major problem here. One thing I realized is that Piffer asked Kenya Kura to review the paper, since Frost had rejected it and Chuck was being persistent about the problem with PISA CPS. This makes me think of something : an author may ignore the criticisms of some reviewers and request, instead, other reviewers in order to get the 3 approvals. Fortunately, Piffer tried to answer the objections. But this remains a potential problem, and OP has, for now, no way to deal with it, as Emil seems unwilling to improve the system (see below).

3. Crime, income and employment among immigrant groups in Norway and Finland [ODP, 2014] (Kirkegaard O. W., Emil)

In this comment, I said that Emil’s application of multiple imputation is wrong. Chuck (Fuerst) also believed that his application is not fully satisfying. Yet Emil ignored all of our comments. I approved (as did Chuck) because it doesn’t change his result. Still, a wrong application of imputation is something ugly. I could have disapproved, and it wouldn’t have been an outrage.

4. The Elusive X-Factor: A Critique of J. M. Kaplan’s Model of Race and IQ [ODP, 2014] (Author : Dalliard, M.)

It’s an excellent article and I am pleased to see it had some success on the Human Varieties blog, but I got the bad feeling that some people (e.g., Chuck, alias John Fuerst, and Emil) wanted to speed up the review. For instance, about my comments on the “proof” of Spearman’s hypothesis from MCV studies, Emil says :

Is this discussion really relevcant to the review of Dalliard’s paper? Can we stick to the topic?

How can I not laugh at this stupid comment ? If Emil read the paper, and I believe he did, he would see that Dalliard said many things, among them that g is an important factor in explaining racial IQ gaps and is not influenced by discrimination and X-factors in general, contrary to Kaplan’s claims, and that Kaplan puts emphasis on X-factors but completely ignores the relevance of the g factor in racial IQ gaps.

My main problem has to do with the evidence cited by Dalliard with respect to Spearman’s hypothesis. All of these studies rely on MCV, and I pointed out its problem (Dolan, 2000; Dolan & Hamaker, 2001). MCV can’t test alternative models, and thus alternative hypotheses. Unless you can show that the g model is superior to non-g models, you can’t show that Spearman’s model is to be retained.

One curious thing is that I debated mainly with Chuck, not Dalliard. The latter was not very talkative, but Chuck was. I would have expected (and hoped) that the reviewers would discuss matters with the author himself rather than with another reviewer.

Another thing is that Dalliard appeared displeased with my persistence about MCV. It’s not as if I had attacked anyone, and yet Dalliard used the term “fetish” against me (that I fetishize model fitting), a word which generally has a very bad, pejorative connotation. If I’m not mistaken, you use that term when you want to make someone look bad.

Since Chuck understood that it would take some time before the matter was settled (because I’m persistent), and seeing that he couldn’t convince me of the relevance of MCV, he called in Philbrick Bastinado, who had just arrived, out of nowhere. He asked him if he could approve. The answer was yes, so there were 3 approvals, and the paper was accepted without my either rejecting or approving it (of course, because no one else wanted to debate any longer with me). But note the significant detail : Chuck didn’t ask Philbrick to review it, he asked him to approve it. That’s what annoyed me the most.

5. An update on the secular narrowing of the black-white gap in the Wordsum vocabulary test (1974-2012) [ODP, 2014] (Author : Hu, M.)

Although it took some time before the reviewers decided to break the silence, everybody left their comments. From what I can see, Emil and Chuck were the ones who made the best remarks.

But everything went sour when Emil started to make a fuss out of nothing, exactly when I uploaded my latest version that contains the multilevel regression analyses and other modifications.

The first complaint was about my R syntax. For example :

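library(VGAM)     # provides vglm() and tobit()
library(lattice)  # provides xyplot()
# Tobit regression of Wordsum (ceiling at 10) on race, age polynomials and their interactions
m= vglm(wordsum ~ bw1 + agec + agec2 + agec3 + bw1agec + bw1agec2 + bw1agec3, tobit(Upper=10), data=d, weights=d$weight)
# Plot the predicted Wordsum values against age, separately for each group
xyplot(d$wordsumpredictedage ~ d$age, data=d, groups=bw1, pch=19, type=c("p"), col = c('red', 'blue'), grid=TRUE, ylab="Wordsum predicted by tobit", xlab="age", key=list(text=list(c("Black", "White")), points=list(pch=c(19,19), col=c("red", "blue")), columns=2))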

To calculate Y-hat (i.e., Y predicted by the regression model), I copied (in fact, copy-pasted) the numbers output by my software (Stata 12). Emil said that I should “extract the values from the model object” so that I “don’t need to type them manually”. His point was that this removes the possibility of my making a mistake when typing the numbers.

There are several problems, however. First, the syntax he suggested didn’t work for me, and we found out together that R is hard to manipulate, although we finally worked out the right syntax. Second, and more annoying, I had said that this method is useless if my regression model includes SES and background variables, since I should exclude them from the computation of Y-hat, as I explained in this blog article and in the forum. Otherwise, here’s what might happen :

[Figure 12 from “How to calculate and use predicted Y-values in multiple regression”]

The reason is that Y-hat produces one data point for each individual in the data. If 10 individuals have the same age and IQ but all have different income levels, I would get 10 data points, whereas if income is not included I would get a single data point. This is because the data points locate each person at each value of the SES/background variables, e.g., income, degree, age, etc., across the values of Y. If we want to graph the changes in Wordsum when controlling for SES, we must not include the SES variables in the computation. But Emil, stubborn as he is, carried on and ignored my comments. The only way, thus, to compute Y-hat in this situation is to do it manually, i.e., by summing the intercept and, for each independent variable whose variation we want to see (in this case, cohort or survey year), its coded value multiplied by its parameter estimate.
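The manual computation just described can be sketched as follows. This is only an illustration in Python, not the syntax used in the paper : the coefficient values, the variable names (`bw`, `year_c`, `income_c`) and the helper `predicted_y` are all hypothetical, and the sketch assumes the controlled covariate is centered, so that leaving it out of the sum amounts to holding it at its mean.

```python
# Hypothetical coefficients from a regression of Wordsum on race, centered
# survey year, their interaction, and a centered SES covariate. These numbers
# are made up for illustration; they are not the estimates from the paper.
coefs = {
    "intercept": 5.0,
    "bw": 1.2,             # 0 = Black, 1 = White
    "year_c": 0.010,       # survey year, centered
    "bw_x_year_c": -0.005,
    "income_c": 0.40,      # centered log income (controlled for, not plotted)
}


def predicted_y(bw, year_c):
    """Y-hat built from the intercept and the plotted variables only.

    income_c is deliberately left out of the sum, which is the same as
    holding it at its mean (zero, because it is centered).
    """
    return (coefs["intercept"]
            + coefs["bw"] * bw
            + coefs["year_c"] * year_c
            + coefs["bw_x_year_c"] * bw * year_c)


# One predicted value per group and per year, whatever the respondents' incomes.
years_c = [-20, -10, 0, 10, 20]
gap = [predicted_y(1, y) - predicted_y(0, y) for y in years_c]
```

With this approach, each survey year yields exactly one predicted value per group, so the plotted gap is not scattered across income levels.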

Meanwhile, Emil also insisted on an argument he had made earlier in this review. In his opinion, his simulation, in a blog article, shows that when Shapiro-Wilk’s W value is below 0.99, the distribution is necessarily very non-normal (his graphs with W<0.99 always show this). I illustrated how this is wrong by providing this link, where we see that variable1 is perfectly normal given the histogram and Q-Q plot, but the W value of S-W is 0.98. Emil misread the article, since he said “Their W value for the non-normal distribution is .98.” while in fact W=0.981 is for variable1 (normal) and W=0.805 is for variable2 (non-normal). In my reply, I provided another example, where we have W=0.9737 for normal data. Emil replied that this is irrelevant because the sample sizes are too small (N=36 for the first link and N=20 for the second) whereas in his simulation the N was always 5000, and almost as large in the GSS samples I used. Thus, he now says that his guideline applies only to large Ns, yet Emil never invoked this condition before, neither in his blog article nor anywhere in the forum. He did so only after I showed him that W<0.99 is not necessarily very non-normal. That’s why I told him he had been wrong, yet Emil denied it (“Not plain wrong, but restricted to large datasets.”), while ignoring my comment that no conditions were specified before I raised this problem. If Emil really thought from the very beginning that his guideline is only effective for large Ns, he should have said so in the blog article. But he couldn’t have known whether the guideline holds only for large Ns, because that would mean he had already run simulations for small Ns, which he never did. And if he had, he would have proposed one guideline for small Ns and another for large Ns (and perhaps medium Ns). So, Emil is clearly lying about not being wrong.

But then comes something hilarious. The beginning of a clown show. In response to the above comment by Emil (about small Ns), I looked (quickly) at some GSS variables. The age and cohort variables (N=3676 for both) show a deviation from normality, according to histograms, so slight that most researchers would not consider it important and would treat the variables as approximately normal. Yet, for age and cohort, the W values are 0.970 and 0.984, respectively. This refutes his claim that “Anything below .99 is hopelessly unnormal”. But Emil never gives up :

There is nothing wrong with the .99 guide. Your judgments based on Q-Q plots do not show much. Subjective judgments are bad in science (and everywhere else). If there is an objective way, use that instead.

Ok. The histogram is useless (due to arbitrariness of judgment), but Shapiro-Wilk is super cool because of its complete objectivity. Put otherwise, if the S-W value is below 0.99, the data are necessarily non-normal, regardless of the histograms and Q-Q plots.

Well, I was wondering how someone as intelligent as Emil could say such silly things and contradict himself so many times. For instance, he requested many times (here, here, here, here) that the graphs be moved up and embedded into the text, while the tables could be left at the very bottom of the paper; I said I dislike doing this because it is illogical and people never do it (tables and figures are either both embedded in the text or both left at the end of the paper). This in itself is proof that Emil considers graphs (subjective according to Emil) to be more important than the tables displaying the parameter values (non-subjective according to Emil). And yet he said the graphs are bad because of “judgment calls” and other nonsensical bla bla. In one of Emil’s blog articles, here, where he ran several simulations of the S-W test, he used graphs, and under one of these histograms he said “looks fairly normal, altho too fat” regarding this graph here, but also “This one has a longer tail on the right side, but it still looks fairly normal” about this other graph. Thus, I cannot use histograms to determine normality, but Emil has the right to do so. It’s impossible for me to believe he is not aware of his own contradictions. The only remaining possibility was that Emil said things he did not believe himself. He was dishonest with himself. That made me furious, of course, because he was clearly doing this just to avoid admitting he was wrong.

Emil, obviously, ignores the implication of what he says. If the histogram is necessarily useless due to arbitrariness in judging the data pattern, that means that no one can tell the difference between the following two graphs :

[Figure: histogram of a normal distribution]

[Figure: histogram showing non-normality]

By that logic, there is no difference between the two, simply because they are graphs. This is so ridiculous that I am running out of superlatives. According to this same logic, one cannot tell whether there is a narrowing of the black-white gap in Wordsum scores in the following graph :

[Figure 16 from “How to calculate and use predicted Y-values in multiple regression”]

One thing Emil didn’t understand about the graph is that it just plots the parameter values. For instance, if we consider the first of the two histograms above and, instead of plotting the values, we simply write them down, we obtain something like :

    1 1 1
    1 1 1
  1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1

Or something like :

3 4 6 11 6 4 3

There is no difference between this and the histogram. They describe the same thing, just in different forms. Yet most people will request the graph, since it is easier to read than the above. I think you can understand now why there is no more arbitrariness in a graph than there is in a table filled with numbers. Concerning subjectivity of graphs, I have said in the forum :

By this, you are implying that people cannot read a graph. But everyone can tell the huge difference with high accuracy. I insist on accuracy because you missed that several times. And even your opinion that SW test is an objective test is wrong and contradicted by several of your earlier comments on p-values. You said many times that people use arbitrary cut-off values (0.05) and decide that 0.04 is good but 0.06 is bad, even though there is no way to tell the difference. That is, you understand that judgments based on p values cannot be objective (there is no universal agreement). But the same thing is true for W value. How can you tell that 0.94 is no good but 0.98 is good, for example ? And what is the magnitude of the difference between the two values ?

This is why you have illustrated the W values with histograms in your blog post, because you know fully well that this single number cannot describe accurately the distribution of the data, which can take very complex forms. And it’s too abstract. With histogram, you have the entire picture; i.e., the number of persons in each value of the variable. You make more accurate judgments based on histogram than based on W value, which cannot be understood without graphical description of the data. And you know that, because you used histogram to interpret your W value.

Furthermore, the world is not the caricature Emil was describing :

Yes, there is deviation from normal. That is what W shows. Why are we disagreeing about? First you say .99 doesn’t work even for large datasets. Then you agree that the two variables with .94-.95 show small to modest deviation from normality. Which way is it?

This is easy to answer, as I did here. There is no such dichotomy as a definitive yes and a definitive no. There is gradation : perfectly normal, approximately normal, slightly non-normal, non-normal, extremely non-normal. My answer was thus that the non-normality is so small that it is of little importance. As usual, Emil ignored my comment.

At the same time, we continued to talk about R syntax and about the possibility that I had made a mistake. What bothers me is that he didn’t trust my syntax, yet he never wanted to check all my numbers. So, I said :

I don’t know how many times I should repeat it. Your comment assume you don’t trust me, that is, you think I must have necessarily made a mistake somewhere. Why not rerun the syntax, if you want to see ? Like I’ve said, I examined the numbers many times before. Yes. Too many times.

And it’s not a bad code. Perhaps in a subjective way, but not in an objective way. A code is objectively “bad” or “wrong” only if it produces erroneous results. All of my results are correct.

To this comment, Emil answered :

I don’t trust you to have done analyses correctly. It is not personal. I don’t trust anyone to have done analyses correctly, including myself in previous studies (who has not found errors in their previous analyses when they looked through them at a later point?). Science must be open to scrutiny of the methods because everybody makes mistakes.

This is really pathetic. Here’s how I responded :

In that case, you must examine closely the syntax of everyone, or you should not give approvals to anyone. And I proposed you do examine my entire syntax. And if you don’t trust your own studies, that means I shouldn’t have accepted your publications and that I shouldn’t do this with your ongoing and future publications. It’s logical to me that I should reject publication when the author itself doesn’t trust his own result.

What you miss here is that there are two kinds of error. One that does not affect the conclusions of the article, and one that changes the conclusion, (e.g., this).

Indeed, Emil could only annihilate my defense by putting himself in a very, very uncomfortable situation.

Finally, I asked him whether he could have defined his guideline of W<0.99 without graphical representations, e.g., histograms, and whether he could re-post his blog article without the histograms. It is obvious to me that he couldn’t. I thus speculated that the reason he used histograms is that he knew it is not possible to understand normality with S-W alone, while we can get a good grasp of the problem with histograms alone. Emil himself determined normality with histograms, after all. And to what I just said, here’s the only reply that Emil was capable of :

I did of course not say what you claim I said. It is a straw man.

Finally, I will now show what really made me extremely furious. Emil started to pretend I had made an error in my syntax. He said :

I found some errors in the code. E.g. you had calculated the wrong mean when re-centering. Again, you had inputted values manually instead of using a function. Bad.

The coefficients you typed in are not identical to those in the summary file, perhaps because I fixed the re-centering issue. They are however, very similar.

At this point, there was no indication that Emil was lying. But I told him he had made a mistake. Unlike me, he didn’t use sampling weights. My answer reads :

It seems you said it was the value 40.62 which was the problem. Once again, I said I’m not wrong. In my article, I said I have taken the mean age for people having wordsum score and I have also applied sampling weight for all of my analyses. See below :

. summarize wordsum age logincome educ degree year cohort [aweight = weight] if !missing(wordsum)

  Variable |    Obs      Weight      Mean  Std. Dev.       Min       Max
   wordsum |  23817  24541.0623  6.013715   2.102101         0        10
       age |  23817  24541.0623  40.62188   13.87005        18        69
 logincome |  21797   22231.922  10.12833   .9611794  5.501258  11.95208
      educ |  23775  24496.7364  13.09723   2.897454         0        20
    degree |  23779  24501.2668  1.396848   1.142989         0         4
      year |  23817  24541.0623  1992.703   11.16641      1974      2012
    cohort |  23817  24541.0623  1952.081    17.3142      1905      1994

Thus, 1) I said I had used sampling weights and 2) I reported numbers that are different from what Emil claimed they were. Despite these two details, Emil persisted :

The value you wrote and used is wrong for the data you gave.

 > mean(d$age)
 [1] 41.47897

I am using your syntax and the data you supplied.
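A gap of this kind between the two means is what one would expect whenever sampling weights are applied in one computation and not in the other. Here is a minimal sketch in Python, with made-up ages and weights (not the GSS data) and a hypothetical helper `weighted_mean` :

```python
def weighted_mean(values, weights):
    """Mean in which each observation counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Made-up ages and sampling weights, for illustration only.
ages = [22, 35, 41, 58, 67]
weights = [1.8, 1.2, 1.0, 0.6, 0.4]

# Plain average: every respondent counts equally, as with mean(d$age) in R.
unweighted = sum(ages) / len(ages)

# Weighted average: the same formula Stata applies to the mean under aweights.
weighted = weighted_mean(ages, weights)
```

In R, the base function `weighted.mean(x, w)` computes the same weighted quantity, so the comparison can be repeated directly on the supplied data.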

But what is disconcerting is that he quoted my paragraph which says, at the end, that I used sampling weights (the last sentence being “see below”, i.e., my numbers). Because I did not believe he could miss that again, I wrote :

What is bad is to lie, i.e., pretending there is an error when the person know there is no error.

And later, in the same post :

I’m disappointed that you’re lying. It’s dishonest. You know fully well you’re wrong. The last sentence you have quoted says that “I have also applied sampling weight” and I have also provided the weighted numbers from the Stata. It’s impossible you have missed them. You decided to ignore all them because you don’t want to admit in front of everyone else that you’re wrong on everything.

An additional reason that made me believe he was lying when he claimed I had made an error in my syntax is that he said things he doesn’t believe (using histograms while prohibiting me from doing so) and dismissed arguments that contradict his opinions (the refutation of his W<0.99 guideline).

I do not know why he went so far. Something tells me (but perhaps I’m wrong) that he decided to make up this error because he wanted to show me that the way I write syntax is no good (i.e., typing all the numbers) and he needed an example to convince me.

At some point, I said that my R simulation shows that his guideline of W<0.99 is not satisfying. We can try, for example :

hist(x, freq=FALSE)

hist(x, breaks=50)

x <- rlnorm(5000, 3, .22)
hist(x, breaks=50)

The non-normality is just modest, yet W is about 0.95 and 0.96, well below Emil’s cutoff. But he was still not happy, as he said “that it is decidedly non-normal (very long right tail, cut off left tail)”. Yet the third syntax shows histograms with a very close approximation to normality, and a W value of around 0.97. But Emil continued to deny it. This is what he said :

Histograms with a proper number of breaks/bins show that it is decidedly non-normal (very long right tail, cut off left tail).

Unfortunately, his comment applied to the second syntax, which displays what is probably a modest deviation from normality. Once more, Emil ignored everything that puts him in an embarrassing situation. I do not believe he missed the third syntax or that he never tried it. The fact that he ignored it tells me that he is lying, once more.
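For readers who want to try the comparison without R, here is a minimal sketch in Python using only the standard library. It does not compute Shapiro-Wilk’s W itself : as a stand-in, it uses the Shapiro-Francia style statistic (the squared correlation between the sorted data and approximate normal scores), which is an assumption of this sketch and is close to, but not identical with, W in large samples. The function name `normal_plot_r2` and the random seed are hypothetical choices; the log-normal parameters mirror the `rlnorm(5000, 3, .22)` call above.

```python
import random
from statistics import NormalDist, mean


def normal_plot_r2(sample):
    """Squared correlation between the sorted data and approximate normal scores.

    This is a Shapiro-Francia style statistic, used here as a stand-in for
    Shapiro-Wilk's W (the two are close for large samples, not identical).
    """
    xs = sorted(sample)
    n = len(xs)
    nd = NormalDist()
    # Blom's approximation to the expected normal order statistics.
    scores = [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mx, ms = mean(xs), mean(scores)
    sxy = sum((x - mx) * (s - ms) for x, s in zip(xs, scores))
    sxx = sum((x - mx) ** 2 for x in xs)
    sss = sum((s - ms) ** 2 for s in scores)
    return sxy ** 2 / (sxx * sss)


rng = random.Random(1)  # fixed seed so the run is reproducible
normal_sample = [rng.gauss(0, 1) for _ in range(5000)]
lognormal_sample = [rng.lognormvariate(3, 0.22) for _ in range(5000)]

w_normal = normal_plot_r2(normal_sample)
w_lognormal = normal_plot_r2(lognormal_sample)
```

Comparing `w_lognormal` with `w_normal`, next to a histogram of each sample, gives a sense of how far the statistic moves for a given visible amount of skew.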

I ended up with the feeling that Emil thinks, being an editor, that admitting in public that he is wrong would cost him his credibility, or something like this, especially when losing to someone like me (an unknown guy with no credentials in this specialized field). I have also been very surprised by Emil’s attachment to the Shapiro-Wilk test. The first time he affirmed (first here, and then here) its superiority over histograms, I said that a histogram can show the pattern of the data and, thus, which transformation we need (if one is needed). I was also not convinced by his arguments, i.e., that eyeballing is inaccurate and that the test confirms normality regardless of N as long as the data are normal. This is the same problem with p values that I have noted here, and everywhere else, all the time : with a large N, a significance test can flag non-normality even if the non-normality is trivial. In any case, I have always had the feeling that he championed the S-W test because it looks more sophisticated. That is, people will be more impressed by seeing Shapiro-Wilk than a histogram.

All this being said, the key thing to remember is that when people are afraid of looking bad by losing a debate in public, there is little reason to expect intellectual honesty. And the conflict between Emil and me had some adverse effects on the review of the following paper.

6. Increasing inequality in general intelligence and socioeconomic status as a result of immigration in Denmark 1980-2014 [ODP, 2015] (Author : Kirkegaard O. W., Emil, & Tranberg, Bo)

I am not even sure Tranberg contributed anything important here, and he never commented. That’s why in the forums, I usually preferred to say “Emil did not reply (bla bla bla…)”. In any case, while things went smoothly at the beginning, Emil stopped considering my comments after what happened with my Wordsum paper.

I showed that Emil does not understand what a model is. It is an approximation to reality, and as such, a model can’t be a descriptive statistic (e.g., a mean or an average) like his Figure 6, which corresponds to his so-called model of no IQ gains. I explained and showed why this can’t be said to be a competing model to the models of weak, medium and strong gains (which are genuine models). Emil repeated that this definition of a model is just mine, i.e., it’s subjective, an “idiosyncratic narrow definition” according to him. In fact, it seems it is Emil who has the wrong definition, one that is not shared by most researchers and statisticians, and one that only suits him. So, I made a list of many references, with the appropriate citations. Here’s a sample :

The True Meaning Of Statistical Models

Briggs (2014) has nicely summarized the essence of a typical statistical model : “Why substitute perfectly good reality with a model?”, “Because a statistical model is only interested in quantifying the uncertainty in some observable, given clearly stated evidence”, “Every model (causal or statistical or combination) implies (logically implies) a prediction”. This cannot illustrate better all I have said earlier. A statistical model is an approximation, and thus is different from a descriptive stats. Unfortunately, your so-called statistical model of no gain has no uncertainty in it.

But he didn’t care and said :

If we can get Dalliard, Kura, Meisenberg, and Fuerst or Piffer, then we will have 4 approvals.

That’s my main concern about OP. Some authors have very little incentive to discuss things they can’t deal with. So, they give up, ignore the comment, and ask other reviewers whether they approve.

Meanwhile, I insisted (because I’m usually very persistent, in a way that irritates people), and wrote this comment. But Emil went on to deny absolutely everything :

The position you ascribe to me is not mine (i.e. straw man).

But here’s what he said before :

One does not need actual comparison data for modeling

And I responded many times that observed data is not modeling and, as I said, I cited many references. Thus, Emil cannot say, in this comment here, that “the best fitting model is the no gains one”, because model fit measures how well a model approximates reality (the observed data), and his “no gains” figure is itself observed data. So, he (and not I) is making a strawman out of my comment.

A few days later, after Kenya Kura’s comment, I remembered something I had forgotten : the recent increase in income inequality is due mainly (if not entirely) to the very top income shares (e.g., the top 1% or 0.1%), which itself has been shown to be due to housing capital. These rich people own an increasingly large share of the national income. So, I explained that his theory is not confirmed by the data, because when we ignore the top incomes, there is no increase in income inequality. This refutes the proposed theory because the theory implies that the recent rise in inequality (owing to, e.g., increased immigration of low IQ people) is due to a shift in the % of low IQ (and thus low income) people, while it was in fact due to something completely unrelated. Another crucial part of my argument is that I assumed low-IQ immigration to be uncorrelated with top income shares, which means that the latter variable is not a confounding factor (i.e., the trend in inequality will remain flat even if the “top income shares” variable is removed from the right side (the independent variables) of the regression equation). I also said that there is a theory (in economics, not psychology) that is fully capable of explaining the changes in income inequality. It’s the ABCT, as I argued last year. The drastic and sudden changes (ups and downs) occur precisely in step with the economic boom-bust cycles (related to housing and financial bubbles) and possibly with the market deregulation of the early 1980s, especially in the US and UK. This is not to say that the proposed theory is wrong, and I said it is theoretically unbeatable. If IQ causes SES, SD(IQ) must be causally related to SD(SES). Yet the recent rise in inequality is due to something else.

Probably in order to nullify my comment without offering a counter-argument (because he obviously couldn’t), Emil made a strawman in his reply :

We don’t want to delve deeper into economics territory in which neither author is well-read.

But Emil does not understand economics, which is about theory in the sense of “laws” and “principles”. When I say that the top income shares variable alone explains the rise in income inequality, this obviously has nothing to do with economic theory. And there is no need for me to say so explicitly.

But I continued and explained (and implicitly illustrated) that the matter is about data interpretation. Again, I said that his theory can’t explain the pattern of the data. About my comments, Emil said :

I am not trying to get your approval, so I am ignoring them.

Even if he was still angry with me because of what happened in the review of my own article, he shouldn’t have acted like this, because it’s not professional.

Emil persisted in making everyone (including himself) think that I was off-topic and that, consequently, my entire argument made no sense and should be ignored. Specifically, he said his paper is about psychology, not economics. In this way, he tried, very cleverly, to dismiss my entire argument without needing to reply substantively.

And my reply reads :

I’m not talking about economics here. But data. I asked : How can you explain the pattern of income inequalities over time given your proposed theory ? You didn’t reply, nor did John Fuerst (even though he was the one who initiated the conversation between him and I). … What is of upmost interest here is how you can answer my barrage of questions here. These are not “economic questions” however. These are questions concerning the data.

It seems both of you, Emil and Chuck, you do not understand what is an economic question. Economics is about theory, e.g., law of supply and demand. None of my question purported to such issues. I’m not that stupid.

One misunderstanding concerns a paragraph of section 9, where it is acknowledged that inequality decreased during the 20th century but then started to increase in the mid 1990s. So, I asked why the same factors that caused the decrease in inequality would have stopped or changed so as to result in increased inequality. Does that mean that Emil wants to believe that, everything else constant (ceteris paribus), an increase in SD(IQ) will cause an increase in SD(income) ? He has only shown us the observed data (Figure 9). Observed data is not modeling, so ceteris paribus does not hold here. Thus, that graph tells us nothing about what the trend in inequality would look like, ceteris paribus. Yet, Emil wrote this astonishing claim in section 9 of the paper :

The results of the modeling suggest that social inequality in Denmark is increasing due to the increasing SD of g in the country (and perhaps because of falling average g too). Of course, there are many factors that affect social inequality, and the predicted effect size is probably small, so it may not be visible in actual data yet.

If one looks at income data, then generally inequality has been decreasing since the beginning of the century, but there is an upwards trend in the recent period from perhaps the mid 1980s to now.

Finally, there is one small detail. Emil wrote in a footnote :

A reviewer objected that no g gains is compatible with less than 100% heritability if one assumes gene-environment interaction models.

This was written at a time when he and I were not yet in conflict over the review of my Wordsum paper. I do not know why I said nothing at that time, although I knew it was wrong. But I later decided to write this comment about my early objection :

This is incorrect. What I was talking about is G-E correlation, not GxE interaction.

And here’s Emil’s correction :

A reviewer objected that no g gains is compatible with less than 100% heritability if one assumes gene-environment interaction/correlation models.

I was extremely irritated, and I left this comment :

I said “gene-environment correlation” but I never said “gene-environment interaction/correlation”.

What happened ? Nothing, because Emil ignored that comment. He claimed I was attacking a strawman (about the meaning of modeling) while he was the one doing it.

Yet there is more to say. One thing that annoyed me is that Chuck was attempting to defend Emil, probably because of conflicts of interest (read further below). Just like Emil, he said that I was off-topic and that I didn’t understand the purpose of the paper or the main claims of the authors. Chuck’s first comment reads :

I confess that I do not see the relevance of you criticisms. The phenomenon that the authors concern themselves with is intuitively obvious. What the paper adds is that it models the phenomenon and so provides a method for quantifying its social effect. The authors note that social inequality is affected by many factors. And they noted that an immigrant effect can be masked by other social effects. They concern themselves with a hypothetical immigrant effect ceteris paribus. Since they did not attempt to test whether the immigrant effect noticeably contributed the staling of the secular decline in inequality in Denmark, your criticisms seem out of place.

Chuck (here) and I (here) continued, and he made three claims (the third not being very relevant to what I am talking about here).

First is that the authors assume ceteris paribus (which is by definition a model). This is false, as I explained, since they use data, not modeling, to support their claim. I have cited, above, a passage from section 9 which reads “If one looks at income data, then generally inequality has been decreasing since the beginning of the century, but there is an upwards trend in the recent period from perhaps the mid 1980s to now”. It is said that many factors influence inequality and that this can make it difficult to reveal the independent effect of SD(IQ) on SD(income). Yet, in that citation, what is the “but” for ? If, as I suspect, that sentence is related to the previous one, it is obvious that the author is saying that only a segment (1990s to 2010) of the entire period covered (1870-2010) is consistent with the predicted effect, given how the proportion of immigrants increased from the 1980s to the 2010s in Denmark (see their Figure 1). Furthermore, if Emil believed the data shed no light on the relevance of his proposed theory, I doubt he would even show the graphs. I do not think people who propose a theory would show data that refute their proposition or do not help it : people show data only when the data support the theory more than not.

Interestingly, in the final version, one can read, below Figures 9-10 : “To be sure, the recent increase in economic inequality may be due to many things that have nothing to do with increased variation in g from immigration. The present study does not attempt to show that the recent increase is due to immigration, and we merely regard the above as circumstantial evidence”. That paragraph was added in later versions, after Chuck’s comment about the authors making a ceteris paribus assumption. Emil, I suspect, wanted to add some nuance to his early interpretation of Figures 9-10. But by doing so, he destroys what remains of the usefulness of section 9, because the authors now say openly that these data have no relevance at all for their theory.

In any case, the most important point is that economic variables alone explain all the available data on the changes in income inequality over time. As I have repeated many times in OP forums (e.g., here, here, here, here, here), SD(IQ) is undoubtedly uncorrelated with capital gains, housing capital, top income shares, and business cycles. Thus, a bivariate correlation and a multiple regression (where you partial out all of the theoretically relevant confounding factors) would produce just the same results. Neither Chuck nor Emil ever addressed one of my most critical comments : that their defense (i.e., the ceteris paribus assumption) of the paper’s theory is indefensible. I said, finally, that the only possible way to salvage their proposition is to argue that the trend (when removing the top income shares) is flat because other unobserved, unidentified factors are acting to hide the true effect of SD(IQ). But, in this case, what are those factors ? They must identify them and explain their theoretical as well as empirical relevance. Of course, they did nothing of the sort. This is problematic since they agreed that Emil’s hypothesis and model assume ceteris paribus, whereas section 9 presents observed data, not modeling; that’s why, as I said above, the data do not help to evaluate the relevance of the theory. Yet it seems that, for Emil, if the data are consistent with his theory, it’s good, and if they are not, that’s not bad either. In other words, it’s like “heads I win, tails I don’t lose”. Is it a fair game ?
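The point that a bivariate analysis and a multiple regression give the same answer when the predictor is uncorrelated with the other covariates can be checked with a small sketch. Everything below is a toy illustration and not the data of the paper : `x` stands in for a predictor such as SD(IQ), `z_orth` and `z_corr` for an economic covariate, and all numbers and function names are made up for the example.

```python
# Toy illustration (hypothetical numbers, not the paper's data):
# when x is uncorrelated with the covariate z, the slope of y on x is the
# same whether or not z is partialled out; when x and z are correlated,
# the two slopes differ.

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

def bivariate_slope(x, y):
    """Slope of the simple regression y ~ x."""
    return cov(x, y) / cov(x, x)

def partial_slope(x, z, y):
    """Slope on x in the multiple regression y ~ x + z (normal equations)."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    det = sxx * szz - sxz ** 2
    return (szz * sxy - sxz * szy) / det

def make_y(x, z, noise):
    """Outcome built as 0.5*x + 2*z + noise (arbitrary coefficients)."""
    return [0.5 * a + 2.0 * b + e for a, b, e in zip(x, z, noise)]

x = [-1.0, -1.0, 1.0, 1.0]
noise = [0.1, -0.2, 0.3, -0.1]
z_orth = [-1.0, 1.0, -1.0, 1.0]   # uncorrelated with x
z_corr = [-2.0, -1.0, 1.0, 2.0]   # correlated with x

y_orth = make_y(x, z_orth, noise)
y_corr = make_y(x, z_corr, noise)

print("uncorrelated covariate:",
      bivariate_slope(x, y_orth), partial_slope(x, z_orth, y_orth))
print("correlated covariate:  ",
      bivariate_slope(x, y_corr), partial_slope(x, z_corr, y_corr))
```

In the first case the two slopes coincide (up to floating-point rounding); in the second they diverge, which is the only situation where partialling out the covariate changes the picture.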

Chuck’s second claim was that the authors didn’t mean that SD(IQ) accounts for a major (or even huge) part of the variation in SD(income). That is, they made no claim about the effect size. As I said here and here, this is partly wrong, since throughout the paper it is argued that IQ is an important cause of SES (sections 2.3 and 10), and data is used in this regard. Chuck also said that the authors seem to admit that the effect size is probably small (but only in section 9). Of course, I noticed that too. But throughout the entire paper, one has the impression that the authors believe that IQ must have an important role because of its correlation with SES (see notably sections 2.3 and 10). The most important question is : why are they bothering with “IQ as a cause of socioeconomic inequality” if they actually believe (as the first paragraph of section 9 shows, as well as the comments, e.g., here) that the effect is trivial, i.e., unimportant ? This is totally incoherent. Thus, for some reason, Emil said in the forums that the effect is probably unimportant, while the opposite view seems to be suggested in the paper. I presume this is because he started writing the paper believing that SD(IQ) is a non-trivial cause of SD(SES).

In the end, I think the comments from Emil and Chuck here show a lot of bad faith.

Lack of anonymity and conflicts of interest

This is the weakest point. For instance, Fuerst (Chuck), Dalliard, and I are co-bloggers at Human Varieties, originally created by Chuck, although Jason Malloy is now the administrator. The three of us are reviewers at OP journals. Dalliard and I have reviewed some of Chuck’s papers. No one says anything, but I think everyone knows that other people may not necessarily trust our opinions, even if the review is open-access.

Everyone knows who is commenting, and some people, such as Chuck, may fear retaliation if they disapprove (see above; Kirkegaard & Tranberg 2015). This is especially true if person X reviews author Y and then Y becomes the reviewer of author X. One possible way to attenuate this problem (somewhat) is to have reviewers who are not authors at OP and are reasonably independent of the authors. As far as I know, there are Gerhard Meisenberg, Peter Frost (who publishes very little, almost nothing, at OP), and, to a lesser extent, Kenya Kura. But all of them are friendly (or at least not hostile) to the hereditarian position regarding racial differences.

I’m not afraid of retaliation, so I say what I really think, and I kick people’s ass when I think it is needed, whether they are my co-bloggers or not. Perhaps Peter Frost is not much different. But I feel that Chuck is not like me. What I mean is that it may not always be obvious when an opinion should be trusted, or whether a reviewer is always too generous or just occasionally so.

Improvement of the system (and unwillingness to do it)

I will not share all the emails I exchanged with Chuck, but he had a nice idea, as usual. He said that even if some reviewers reject the paper, we can still approve it, on the condition that we leave a comment on the paper stating that the author refused to answer the critical comments or that, even if he did, there was no full agreement among the reviewers. I responded that it would be best to add a link, e.g., “controversies”, alongside the links to the paper, review, and materials on the page of the OP paper.

Here’s what Chuck has in mind :

In February 2006, the journal Biology Direct was launched by BioMed Central, providing another alternative to the traditional model of peer review. If authors can find three members of the Editorial Board who will each return a report or will themselves solicit an external review, then the article will be published. As with Philica, reviewers cannot suppress publication, but in contrast to Philica, no reviews are anonymous and no article is published without being reviewed. Authors have the opportunity to withdraw their article, to revise it in response to the reviews, or to publish it without revision. If the authors proceed with publication of their article despite critical comments, readers can clearly see any negative comments along with the names of the reviewers.

There is still one problem, which I have made clear : the reviewers are not anonymous, and reviewers and authors are not independent. If, say, Chuck has to reject Emil’s paper, he will find it more difficult to get Emil’s reviews and approvals for his own upcoming papers. And since Chuck has a joint project with Emil as co-author, the conflict of interest is all too obvious. One can easily fear this kind of retaliation when the reviews are not anonymous. To make things worse, OP forums contain various sections where one can discuss a lot of things related to psychology, psychometrics, behavioral genetics, etc. In that sense, OP really looks like a writers’ club.

I have also said that an additional problem arises as soon as an author cannot reach 3 approvals and decides to ask other (perhaps external) reviewers in the hope of reaching 3 approvals. The author may even decide not to publish the reviewers’ comments that reveal the flaws of the paper. For instance, he emails 5 more people; 2 don’t answer the invitation, 3 answer, and among them 1 accepts and 2 reject the publication. Since the policy of OP journals is that everything should be open-access, the fact that some authors request the opinion of reviewers not listed as OP internal reviewers seems to defeat somewhat the principles of OP. Preferably, an author should contact a reviewer who is likely to follow the progress of the paper, probably someone who has agreed to be a regular OP reviewer and who is active in the list (rather than a ghost reviewer, such as several names listed at OP who have never commented on anything).

Our emails were always group emails : when Chuck sent one, it went to me (Meng Hu) and Emil, and when I sent one, it went to Chuck and Emil. Emil read these emails, as several of his posts in the OP forums suggest. Yet these exchanges date from the end of January and the beginning of February 2015. As of May-June 2015, I see no changes at all, and no comment from Emil.

Minor problems

Time for reviewing a paper

For now, Emil says proudly that OP journals have a much faster review time than any other journal. This was true at the beginning, when everyone (Emil, Piffer, John Fuerst, Philbrick Bastinado, myself, and then Dalliard) made themselves available for the job (for free, of course) on a very regular basis. A review took just 3-4 weeks, and that occurred quite frequently. Today, things are different. Armstrong left the board of reviewers (though for what his reviews were worth…) and Piffer disappeared. Chuck, Dalliard, and I leave comments at a much slower pace. For me, of course, the main reason is that I have almost divorced from OP journals. As a result, the time it takes to publish at OP journals has become much longer (e.g., around 6 months). I understood this would happen as soon as OP opened. As I said many times, although always ignored by Emil, the anomaly was the beginning, not now. It was hard to believe that everyone could keep up that pace all the time. We were “available” for that job because it was a new project, and we were all enthusiastic. But the reviewers and authors were almost always the same persons. I suppose that might trigger a feeling of tiredness. The review is slow and will remain slow if the number of reviewers is not increased. It is only by lowering the burden of work for each individual that the review will become faster.

A curiosity is that some papers, although very short, non-technical, and easy to read, take a lot of time before they get published (e.g., Bakhiet & Lynn’s publications; here or here). I got the feeling it was because no one was motivated, since reviewers take too long to post their comments. Added to this, Bakhiet and Lynn don’t come to the forums, and Emil behaves as if they will. This lengthens the review, because these authors probably wait for the editorial board to let them know what’s happening (since this is the common practice in all other journals), while Emil waits for the authors to respond in the forums.

At the same time, some papers got through the review process relatively fast, such as “Discounting IQ’s Relevance to Organizational Behavior: The “Somebody Else’s Problem” in Management Education” by Bryan J. Pesta et al. (2015). It took one month of review, at a time when the reviewing process was becoming extremely slow (several months). I think the reason is obvious. Pesta is a “renowned” name, and none of the reviewers is a renowned author. Pesta is, after all, the guy who published that paper here. And the pro-HBD people know about that study, of course. Pesta has published many other papers in reputable journals (e.g., Elsevier). Thus, the speed of the review is all the more interesting since, in my opinion, the content of the paper is probably not very interesting. I have read so many papers about the relevance of IQ that are much more important than this one that I can tell. That’s why I didn’t review it. I have more urgent things to do.

It seems to me that the motivation (and thus the time it takes to do a review) depends on things such as the reputation of the author and the subject in question. I knew about this potential problem even before it happened, but now I think I have some illustrations of it. Of course, the length and complexity of the paper are also influential factors, but these papers were not lengthy or hard to read anyway, unlike that one here.

About motivation, there is the obvious fact, if one observes carefully, that when a reviewer agrees to review (and eventually approve) a paper, he hopes to obtain compensation, i.e., that the author will in turn become a reviewer for his paper. Now, imagine that persons X and Y publish on different subjects. X is interested in what Y is doing, but Y is not interested in the subjects usually discussed by X. Then what will happen can be easily predicted. Y will, little by little, lose interest in OP. There is a need to recruit more regular reviewers with different interests. Another related difficulty, it seems to me, is that when a reviewer is not willing to publish at OP, he will have little incentive to be a regular reviewer, especially for a journal that has no reputation yet.

Problem of indexing

OP papers can be indexed only through Google Scholar. So they are somewhat disadvantaged compared to papers in other, more reputable journals. There are two problems. The first is that when Google Scholar indexes a paper, the name of the journal and the publication date do not appear; for OBG at least. I have raised this problem many times, but Emil responded once, remained silent afterwards, and didn’t do anything anyway. Bo Tranberg answered as well, but in a way that tells me he did not understand what I said. In any case, nothing has changed in the end. So, for Chuck’s paper “Genetic and Environmental Determinants of IQ in Black, White, and Hispanic Americans: A Meta-analysis and New Analysis”, Google Scholar displays :

Fuerst, J. Genetic and Environmental Determinants of IQ in Black, White, and Hispanic Americans: A Meta-analysis and New Analysis.

While it should have been :

Fuerst, J. & Dalliard (2014). Genetic and Environmental Determinants of IQ in Black, White, and Hispanic Americans: A Meta-analysis and New Analysis. Open Behavioral Genetics.

By the way, why does Dalliard not appear among the listed authors ? Here’s the second problem. The likely reason is mononyms, which Google Scholar may not accept. That happened to Dalliard’s paper on the X-factor and to the SAT score paper of nooffensebut. Their papers are not indexed. Not even now, as I write this blog article. Emil changed their names, as they both requested, so that they could be indexed by GS, but there is still nothing new today. So, it may mean either that 1) I was wrong, 2) it takes a lot of time for the correction to take effect, or 3) anyone who first publishes under a mononym can’t be indexed even after corrections.
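Regarding the first problem (missing journal name and date), one thing worth checking is whether the OP article pages expose the bibliographic `<meta>` tags that Google Scholar's inclusion guidelines recommend (`citation_title`, `citation_author`, `citation_publication_date`, `citation_journal_title`). I do not know whether the OP pages lack them; that is an assumption, and the helper name `scholar_meta_tags` below is hypothetical. The sketch only shows what such tags would look like for the paper cited above, with a year-only date used for illustration.

```python
from html import escape

def scholar_meta_tags(title, authors, journal, pub_date):
    """Build Highwire-style <meta> tags for an article page.

    Hypothetical helper: tag names follow Google Scholar's inclusion
    guidelines; whether adding them fixes the OP indexing is an assumption.
    """
    fields = [("citation_title", title)]
    # One citation_author tag per author, in the order of the byline.
    fields += [("citation_author", name) for name in authors]
    fields += [
        ("citation_publication_date", pub_date),
        ("citation_journal_title", journal),
    ]
    return [
        '<meta name="{}" content="{}">'.format(name, escape(value, quote=True))
        for name, value in fields
    ]

tags = scholar_meta_tags(
    title=("Genetic and Environmental Determinants of IQ in Black, White, "
           "and Hispanic Americans: A Meta-analysis and New Analysis"),
    authors=["Fuerst, J.", "Dalliard"],
    journal="Open Behavioral Genetics",
    pub_date="2014",
)
print("\n".join(tags))
```

This does not address the mononym question; it only concerns how the journal name, date, and full author list are declared to the indexer.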

Authorized to publish ugly tables and figures

Generally, the tables and figures provided by authors are acceptable. Sometimes they are only more or less acceptable (as in some of the first publications by Emil), and sometimes they are unacceptable. Consider the link to this draft of “U.S. Ethnic/Race Differences in Aptitude by Generation: An Exploratory Meta-analysis” and this one of “The Nature of Race”. I copy and paste these ugly things here :

ugly tables figures 1

ugly tables figures 2

I selected the worst ones, blurred screenshots, but of course many are not like this. For example, I often see something like :

ugly tables figures 3

One thing that bothers me is that it always comes from the same guy, John Fuerst (Chuck). That’s very irritating. Another thing, even more bothersome, is that the editors don’t care about it, and neither do the other reviewers. Still, for the tables, the editors should require that the authors meet certain criteria, e.g., no blurred pictures, no screenshots, high resolution, etc. That is, like in any other respectable journal. Unless, of course, OP does not aim to become a respectable journal.

My opinion

OP journals have many good features that make them superior to other journals. For instance, at OP, submitted papers would not be rejected for stupid reasons (e.g., censorship) such as the one given to me by J. Intell. at MDPI. At the same time, OP is actually no different from a writers’ club, and the potential conflict of interest is not dealt with. OP’s system can be improved, but nothing so far points toward any improvement.

As for me, I have decided not to publish at OP anymore since the main editor, Emil O.W. Kirkegaard, is not an honest person. This is very bad publicity for the reputation of the journal. And that’s why I think it is dangerous for me to publish at OP journals with Emil as editor.
